Abstract. Classical ecological theory predicts that environmental stochasticity increases extinction risk by reducing the average per-capita growth rate of populations. For sedentary populations in a spatially homogeneous yet temporally variable environment, a simple model of population growth is a stochastic differential equation $dZ_t = \mu Z_t\,dt + \sigma Z_t\,dW_t$, $t \ge 0$, where the conditional law of $Z_{t+\Delta t} - Z_t$ given $Z_t = z$ has mean and variance approximately $z\mu\Delta t$ and $z^2\sigma^2\Delta t$ when the time increment $\Delta t$ is small. The long-term stochastic growth rate $\lim_{t\to\infty} t^{-1}\log Z_t$ for such a population equals $\mu - \sigma^2/2$. Most populations, however, experience spatial as well as temporal variability. To understand the interactive effects of environmental stochasticity, spatial heterogeneity, and dispersal on population growth, we study an analogous model $X_t = (X_t^1, \ldots, X_t^n)$, $t \ge 0$, for the population abundances in $n$ patches: the conditional law of $X_{t+\Delta t}$ given $X_t = x$ is such that the conditional mean of $X^i_{t+\Delta t} - X^i_t$ is approximately $[x^i\mu_i + \sum_j (x^j D_{ji} - x^i D_{ij})]\Delta t$, where $\mu_i$ is the per capita growth rate in the $i$-th patch and $D_{ij}$ is the dispersal rate from the $i$-th patch to the $j$-th patch, and the conditional covariance of $X^i_{t+\Delta t} - X^i_t$ and $X^j_{t+\Delta t} - X^j_t$ is approximately $x^i x^j \sigma_{ij}\Delta t$ for some covariance matrix $\Sigma = (\sigma_{ij})$. We show for such a spatially extended population that if $S_t = X_t^1 + \cdots + X_t^n$ denotes the total population abundance, then $Y_t = X_t/S_t$, the vector of patch proportions, converges in law to a random vector $Y_\infty$ as $t \to \infty$, and the stochastic growth rate $\lim_{t\to\infty} t^{-1}\log S_t$ equals the space-time average per-capita growth rate $\sum_i \mu_i \mathbb{E}[Y^i_\infty]$ experienced by the population minus half of the space-time average temporal variation $\mathbb{E}[\sum_{i,j}\sigma_{ij} Y^i_\infty Y^j_\infty]$ experienced by the population. Using this characterization of the stochastic growth rate, we derive an explicit expression for the stochastic growth rate for populations living in two patches, determine which choices of the dispersal matrix $D$ produce the maximal stochastic growth rate for a freely dispersing population, derive an analytic approximation of the stochastic growth rate for dispersal limited populations, and use group theoretic techniques to approximate the stochastic growth rate for populations living in multi-scale landscapes (e.g. insects on plants in meadows on islands). Our results provide fundamental insights into "ideal free" movement in the face of uncertainty, the persistence of coupled sink populations, the evolution of dispersal rates, and the single large or several small (SLOSS) debate in conservation biology. For example, our analysis implies that even in the absence of density-dependent feedbacks, ideal-free dispersers occupy multiple patches in spatially heterogeneous environments provided environmental fluctuations are sufficiently strong and sufficiently weakly correlated across space. In contrast, for diffusively dispersing populations living in similar environments, intermediate dispersal rates maximize their stochastic growth rate.

Key words: stochastic population growth, spatial and temporal heterogeneity, dominant Lyapunov exponent, ideal free movement, evolution of dispersal, single large or several small debate, habitat fragmentation



1. Introduction 

Environmental conditions (e.g. light, precipitation, nutrient availability) vary in space and time. Since these conditions influence survivorship and fecundity of an organism, all organisms, whether they be plants, animals, or viruses, are faced with a fundamental quandary of "Should I stay or should I go?" On the one hand, if individuals disperse in a spatially heterogeneous environment, then they may arrive in locations with poorer environmental conditions. On the other hand, if individuals do not disperse, then they may fare poorly due to temporal fluctuations in local environmental conditions. The consequences of this interaction between dispersal and environmental heterogeneity for population growth have been studied extensively from theoretical, experimental, and applied perspectives [Boyce et al., 2006; Matthews and Gonzalez, 2007; Schreiber, 2010; Durrett and Remenik, in press]. Here, we provide a mathematically rigorous perspective on these interactive effects using spatially explicit models of stochastic population growth.

Population growth is inherently stochastic due to numerous unpredictable causes. For a single, unstructured population with overlapping generations, the simplest model accounting for these fluctuations is a linear stochastic differential equation of the form

(1) $dZ_t = \mu Z_t\,dt + \sigma Z_t\,dB_t$,

where $Z_t$ is the population abundance at time $t$, $\mu$ is the mean per-capita growth rate (that is, $\mathbb{E}[Z_{t+\Delta t} - Z_t \mid Z_t = z] \approx z\mu\Delta t$), $\sigma^2$ is the "infinitesimal" variance of fluctuations in the per-capita growth rate (that is, $\mathbb{E}[(Z_{t+\Delta t} - Z_t - z\mu\Delta t)^2 \mid Z_t = z] \approx z^2\sigma^2\Delta t$), and $B_t$ is a standard Brownian motion. Equivalently, the log population abundance $\log Z_t$ is normally distributed with mean $\log Z_0 + (\mu - \sigma^2/2)t$ and variance $\sigma^2 t$. Hence, even if the mean per-capita growth rate $\mu$ is positive, these populations decline exponentially towards extinction when $\sigma^2/2 > \mu$ due to the predominance of the stochastic fluctuations. Despite its simplicity, the model (1) is used extensively for projecting future population sizes and estimating extinction risk [Dennis et al., 1991; Foley, 1994; Lande et al., 2003]. For example, Dennis et al. [1991] estimated $\mu$ and $\sigma$ for six endangered species. These estimates provided a favorable outlook for the continued recovery of the Whooping Crane (i.e. $\mu > \sigma^2/2$), but unfavorable prospects for the Yellowstone Grizzly Bear.
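The role of the correction $\sigma^2/2$ in model (1) can be checked numerically. The following is a minimal sketch, not taken from the paper: it applies a plain Euler-Maruyama discretization to (1) and reports the realized value of $t^{-1}\log(Z_t/Z_0)$. The function name `simulate_log_growth`, the step size, the seed, and the parameter values $\mu = 0.3$, $\sigma = 1$ are illustrative assumptions.

```python
import math
import random

def simulate_log_growth(mu, sigma, t_end, n_steps, rng):
    """Euler-Maruyama for dZ = mu*Z dt + sigma*Z dB; returns log(Z_T/Z_0)/T."""
    dt = t_end / n_steps
    log_z = 0.0  # track log Z rather than Z to avoid underflow
    for _ in range(n_steps):
        # One Euler step multiplies Z by (1 + mu*dt + sigma*dB).
        factor = 1.0 + mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        log_z += math.log(factor)
    return log_z / t_end

rng = random.Random(1)   # fixed seed, chosen arbitrarily
mu, sigma = 0.3, 1.0     # illustrative values with sigma^2/2 > mu > 0
rate = simulate_log_growth(mu, sigma, t_end=2000.0, n_steps=200000, rng=rng)
print("simulated:", rate, "theory:", mu - 0.5 * sigma**2)
```

With these values the mean per-capita growth rate is positive, yet the realized growth rate should settle near $\mu - \sigma^2/2 = -0.2$, up to sampling and discretization error.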

Individuals cannot avoid being subject to temporal heterogeneity, but it is only when they disperse that they are affected by spatial variation in the environment. The effect of spatial heterogeneity on population growth depends, intuitively, on how individuals respond to environmental cues [Hastings, 1983]. When movement is towards regions with superior habitat quality, the presence of spatial heterogeneity increases the rate of population growth [Chesson, 2000; Schreiber and Lloyd-Smith, 2009]. The most extreme form of this phenomenon occurs when individuals are able to disperse freely and ideally; that is, they can move instantly to the locations that maximize their per-capita growth rate [Fretwell and Lucas, 1970; Cantrell et al., 2007]. Anthropogenically altered habitats, however, can cause a disassociation between cues used by organisms to assess habitat quality and the actual habitat quality. This disassociation can result in negative associations between movement patterns and habitat quality and a corresponding reduction in the rate of population growth [Remes, 2000; Delibes et al., 2001; Schreiber and Lloyd-Smith, 2009]. For "random diffusive movement" (that is, no association between movement patterns and habitat quality), spatial heterogeneity increases population growth rates due to the influence of patches of higher quality. However, this boost in growth rate is most potent for sedentary populations [Hastings, 1983; Dockery et al., 1998; Kirkland et al., 2006; Schreiber and Saltzman, 2009]. This dilutionary effect of dispersal on population growth was observed in the invasion of a woody weed, Mimosa pigra, into the wetlands of tropical Australia [Lonsdale, 1993]. A relatively fast disperser, this weed had a population doubling time of 1.2 years on favorable patches, but it exhibited much slower growth at the regional scale (doubling time of 6.7 years) due to the separation of suitable wetland habitats by unsuitable eucalyptus savannas.

Despite these substantial analytic advances in understanding separately the effects of spatial and temporal heterogeneity on population growth, there are few analytic studies that consider the combined effects. For well-mixed populations with non-overlapping generations living in patchy environments, Metz et al. [1983] showed that population growth is determined by the geometric mean in time of the spatially (arithmetically) averaged per-capita growth rates. A surprising consequence of this expression is that populations coupled by dispersal can persist even though they are extinction prone in every patch [Jansen and Yoshimura, 1998]. This "rescue effect", however, only occurs when spatial correlations are sufficiently weak [Harrison and Quinn, 1989]. Schreiber [2010] extended these results by deriving an analytic approximation for stochastic growth rates for partially mixing populations. This approximation reveals that positive temporal correlations can inflate population growth rates at intermediate dispersal rates, a conclusion consistent with simulation and empirical studies [Roy et al., 2005; Matthews and Gonzalez, 2007]. For example, Matthews and Gonzalez [2007] manipulated metapopulations of Paramecium aurelia by varying spatial-temporal patterns of temperature. In spatially uncorrelated environments, the populations coupled by dispersal always persisted for the duration of the experiment, while some of the uncoupled populations went extinct. Moreover, metapopulations experiencing positive temporal correlations exhibited higher growth rates than metapopulations living in temporally uncorrelated environments.



Here, we introduce and analyze stochastic models of populations that continuously experience uncertainty in time and space. For these models, our analysis answers some fundamental questions in population biology such as:

• How is the long-term spatial distribution of a population related to its rate of growth?
• When are population growth rates maximized at low, high, or intermediate dispersal rates for populations exhibiting diffusive movement?
• What is ideal free movement for individuals constantly facing uncertainty about local environmental conditions?
• To what extent do spatial correlations in temporal fluctuations hamper population persistence?
• How do multiple spatial scales of environmental heterogeneity influence population persistence?

In Section 2 we introduce our model for population growth in a patchy environment. It describes temporal fluctuations in the qualities of the various patches using multivariate Brownian motions with correlated components.

In Section 3 we first consider the vector-valued stochastic process given by the proportions of the population in each patch. These proportions converge in distribution to a (random) equilibrium at large times. The probability that this equilibrium spatial distribution is in some given subset of the set of possible patch proportions is just the long-term average amount of time that the process spends in that subset. We derive a simple expression for the stochastic growth rate of the population in terms of the first and second moments of this equilibrium spatial distribution. We also show that this equilibrium spatial distribution is characterized by a solution of a PDE that we solve in the case of two patches and use to examine how the equilibrium spatial distribution depends on the dispersal mechanism. We then present some numerical simulations to give a first indication of the interesting range of phenomena that can occur when there is spatial heterogeneity in per-capita growth rates and biased movement between patches.

We use the results from Section 3 in Section 4 to investigate ideal free dispersal in stochastic environments. That is, we determine which forms of dispersal maximize the stochastic growth rate for given mean per-capita growth rates in each of the patches and given infinitesimal covariances for their temporal fluctuations.

We consider the effect of constraints on dispersal in Section 5. We suppose that the dispersal rates are fixed up to a scalar multiple $\delta$ and establish an analytic approximation for the stochastic growth rate of the form $a + b/\delta$ for large $\delta$. We use this approximation to give criteria for whether low, intermediate, or high dispersal rates maximize the stochastic growth rate. In particular, we combine this analysis with tools from group representation theory to obtain results on the stochastic growth rate for environments with multiple spatial scales.

We discuss how our results relate to existing literature in Section 6. We end with a collection of Appendices where, for the sake of streamlining the presentation of our results in the remainder of the paper, we collect most of the proofs.

2. The Model

We consider a population with overlapping generations living in a spatially heterogeneous environment consisting of $n$ distinct patches and suppose that the per-capita growth rates within each patch are determined by a mixture of deterministic and stochastic environmental inputs. Let $X_t^i$ denote the abundance of the population in the $i$-th patch at time $t$ and write $X_t = (X_t^1, \ldots, X_t^n)^T$ for the resulting column vector (we will use the superscript $T$ throughout to denote the transpose of a vector or a matrix). If there were no dispersal between patches, it is appropriate to model $X$ as a Markov process with the following specifications for $\Delta t$ small:

$\mathbb{E}[X^i_{t+\Delta t} - X^i_t \mid X_t = x] \approx \mu_i x^i \Delta t$,

where $\mu_i$ is the mean per-capita growth rate in patch $i$, and

$\mathrm{Cov}[X^i_{t+\Delta t} - X^i_t,\ X^j_{t+\Delta t} - X^j_t \mid X_t = x] \approx \sigma_{ij} x^i x^j \Delta t$,

where $\Sigma = (\sigma_{ij})$ is a covariance matrix that captures the spatial dependence between the temporal fluctuations in patch quality. More formally, we consider the system of stochastic differential equations of the form

$dX_t^i = X_t^i(\mu_i\,dt + dE_t^i)$,

where $E_t = \Gamma^T B_t$, $\Gamma$ is an $n \times n$ matrix such that $\Gamma^T\Gamma = \Sigma$, and $B_t = (B_t^1, \ldots, B_t^n)^T$, $t \ge 0$, is a vector of independent standard Brownian motions.

In order to incorporate dispersal that couples the dynamics between patches, let $D_{ij} \ge 0$ for $j \ne i$ be the per-capita rate at which the population in patch $i$ disperses to patch $j$. Define $-D_{ii} := \sum_{j \ne i} D_{ij}$ to be the total per-capita emigration rate out of patch $i$. The resulting matrix $D$ has zero row sums and non-negative off-diagonal entries. We call such matrices dispersal matrices. It is worth noting that any dispersal matrix $D$ can be viewed as a generator of a continuous time Markov chain; that is, if we write $P_t := \exp(tD)$ for $t \ge 0$, so that $P_t$, $t \ge 0$, solves the matrix-valued ODE

$\frac{d}{dt}P_t = P_t D$,

then the matrix $P_t$ has nonnegative entries, its rows sum to one, and the Chapman-Kolmogorov relations $P_sP_t = P_{s+t}$ hold for all $s, t \ge 0$. The $(i,j)$-th entry of $P_t$ gives the proportion of the population that was originally in patch $i$ at time $0$ but has dispersed to patch $j$ at time $t$.
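The stated properties of $P_t = \exp(tD)$ are easy to check on a small example. The sketch below is illustrative only: the three-patch rates are made up, and the helper names `dispersal_matrix`, `matmul`, and `expm` are hypothetical. The matrix exponential is computed with a truncated Taylor series followed by repeated squaring, which is adequate for small matrices but is not a production-quality routine.

```python
def dispersal_matrix(rates):
    """Fill in the diagonal so that every row of D sums to zero."""
    n = len(rates)
    D = [row[:] for row in rates]
    for i in range(n):
        D[i][i] = -sum(D[i][j] for j in range(n) if j != i)
    return D

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(D, t, squarings=10, terms=12):
    """exp(tD): Taylor series of tD / 2**squarings, then repeated squaring."""
    n = len(D)
    scale = t / 2**squarings
    A = [[D[i][j] * scale for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]  # A^k / k!
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        P = matmul(P, P)
    return P

# Hypothetical off-diagonal dispersal rates D_ij (patch i to patch j).
rates = [[0.0, 0.5, 0.1],
         [0.2, 0.0, 0.3],
         [0.4, 0.0, 0.0]]
D = dispersal_matrix(rates)
P1, P2, P3 = expm(D, 1.0), expm(D, 2.0), expm(D, 3.0)
P12 = matmul(P1, P2)   # should agree with P3 by Chapman-Kolmogorov
for row in P1:
    print([round(x, 4) for x in row], "row sum:", round(sum(row), 12))
```

Note that the direct rate from patch 3 to patch 2 is zero here, yet the corresponding entry of $P_1$ is positive because individuals can travel via patch 1; this is the irreducibility property used later.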

Adding dispersal to the regional dynamics leads to the system of stochastic differential equations

(2) $dX_t^i = X_t^i(\mu_i\,dt + dE_t^i) + \sum_{j=1}^n D_{ji}X_t^j\,dt$.

We can write this system more compactly as the vector-valued stochastic differential equation

(3) $dX_t = \mathrm{diag}(X_t)(\mu\,dt + dE_t) + D^TX_t\,dt = \mathrm{diag}(X_t)(\mu\,dt + \Gamma^T dB_t) + D^TX_t\,dt$,

where $\mu := (\mu_1, \ldots, \mu_n)^T$, and, given a vector $u$, we write $\mathrm{diag}(u)$ for the diagonal matrix that has the entries of $u$ along the diagonal.

We implicitly assume in the above set-up that all dispersing individuals arrive in some patch on the landscape. To account for dispersal induced mortality, we can add fictitious patches in which dispersing individuals enter and experience a mortality rate before dispersing to their final destination.

Also, our model does not include density-dependent effects on population growth. However, one can view it as a linearization of a density-dependent model about the extinction equilibrium $(0, \ldots, 0)^T$ and, therefore, (3) determines how the population grows when abundances are low. Moreover, for discrete-time analogues of our model, positive population growth for this linearization implies persistence in the sense that there exists a unique positive stationary distribution for corresponding models with compensating density-dependence [Benaim and Schreiber, 2009]. We conjecture that the same conclusion holds for our continuous time model.

From now on we assume that the dispersal matrix $D$ is irreducible (that is, that it cannot be put into block upper-triangular form by a re-labeling of the patches). This is equivalent to assuming that the entries of the matrix $P_t = \exp(tD)$ are strictly positive for all $t > 0$, and so it is possible to disperse between any two patches. Also, we will assume that the covariance matrix $\Sigma$ has full rank (that is, that it is non-singular). This assumption implies that the randomness in the temporal fluctuations is genuinely $n$-dimensional.

3. The stable patch distribution and stochastic growth rate

3.1. Stable patch distribution. The key to understanding the asymptotic stochastic growth rate of the population is to first examine the dynamics of the spatial distribution of the population. Let $S_t := X_t^1 + \cdots + X_t^n$ denote the total population abundance at time $t$ and write $Y_t^i := X_t^i/S_t$ for the proportion of the total population that is in patch $i$. Set $Y_t := (Y_t^1, \ldots, Y_t^n)^T$. The stochastic process $Y$ takes values in the probability simplex $\Delta := \{y \in \mathbb{R}^n : \sum_i y_i = 1,\ y_i \ge 0\}$.

The following proposition, proved in Appendix A, shows that the stochastic process $Y$ is autonomously Markov; that is, that its evolution dynamics are governed by a stochastic differential equation that does not involve the total population size. Moreover, it says that the law of the random vector $Y_t$ converges to a unique equilibrium as $t \to \infty$. Recall, the law of a random vector $Y \in \mathbb{R}^n$ is the probability measure $\mu_Y$ on $\mathbb{R}^n$ defined by $\mu_Y(A) = \mathbb{P}\{Y \in A\}$ for all Borel sets $A \subset \mathbb{R}^n$. Moreover, for any $\mu_Y$-integrable function $h : \mathbb{R}^n \to \mathbb{R}$, the expectation of $h(Y)$ is defined by

$\mathbb{E}[h(Y)] = \int h(y)\,\mu_Y(dy)$.

A sequence of random vectors $Y_1, Y_2, \ldots$ converges in law to a random vector $Y_\infty$ if

$\lim_{n\to\infty}\mathbb{E}[h(Y_n)] = \mathbb{E}[h(Y_\infty)]$

for every continuous, bounded function $h : \mathbb{R}^n \to \mathbb{R}$. Convergence in law of a sequence of random vectors is also called convergence in distribution of the random vectors and is equivalent to weak convergence of their laws.

Proposition 3.1. Suppose that $X_0 \ne 0$. Then, the stochastic process $Y$ satisfies the stochastic differential equation

(4) $dY_t = (\mathrm{diag}(Y_t) - Y_tY_t^T)\,\Gamma^T dB_t + D^TY_t\,dt + (\mathrm{diag}(Y_t) - Y_tY_t^T)(\mu - \Sigma Y_t)\,dt$.

Moreover, there exists a random variable $Y_\infty$ taking values in the probability simplex $\Delta$ such that $Y_t$ converges in law to $Y_\infty$ as $t \to \infty$ and such that the empirical measure $\Pi_t := \frac{1}{t}\int_0^t \delta_{Y_s}\,ds$ converges almost surely to the law of $Y_\infty$ as $t \to \infty$. The law of $Y_\infty$ does not depend on $X_0$.

The empirical probability measure $\Pi_t$ appearing in Proposition 3.1 describes the proportions of the time interval $[0,t]$ that the process $Y$ spends in the various subsets of its state space $\Delta$. Namely, for a Borel set $A \subset \Delta$ of patch occupancy states, $\Pi_t(A)$ equals the fraction of time spent in these states over the time interval $[0,t]$. For example, if $A = \{y \in \Delta : y_1 > 1/2\}$, then $\Pi_t(A)$ equals the fraction of time for which more than 50% of the population is in patch 1 during the time interval $[0,t]$.

3.2. Stochastic growth rates. Recall that $S_t = X_t^1 + \cdots + X_t^n$ is the total population size at time $t$. That is, $S_t = \mathbf{1}^TX_t$, where $\mathbf{1} = (1, \ldots, 1)^T$. Because $D\mathbf{1} = 0$, it follows from (3) that

$dS_t = X_t^T\Gamma^T dB_t + \mu^TX_t\,dt = S_tY_t^T\Gamma^T dB_t + S_t\mu^TY_t\,dt$.

Therefore, by Itô's lemma [Gardiner, 2004],

$\log S_t = \log S_0 + \int_0^t Y_s^T\Gamma^T dB_s + \int_0^t \mu^TY_s\,ds - \frac{1}{2}\int_0^t Y_s^T\Sigma Y_s\,ds$.

Dividing by $t$, taking the limit as $t \to \infty$, and applying Proposition 3.1 yields the following result.

Theorem 3.2. Suppose that $X_0 \ne 0$. Then,

(5) $\chi = \lim_{t\to\infty} t^{-1}\log S_t = \mu^T\mathbb{E}[Y_\infty] - \frac{1}{2}\mathbb{E}[Y_\infty^T\Sigma Y_\infty]$ almost surely,

where $Y_\infty$ is described in Proposition 3.1.
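The Itô step behind (5) can be spelled out as follows. This is a sketch that uses only the expression for $dS_t$ given before the theorem and the identity $\Gamma^T\Gamma = \Sigma$; the bracket notation $\langle S\rangle_t$ for the quadratic variation is introduced here for convenience.

```latex
\begin{align*}
d\langle S\rangle_t
  &= S_t^2\, Y_t^T\Gamma^T\Gamma\, Y_t\,dt
   = S_t^2\, Y_t^T\Sigma Y_t\,dt, \\
d\log S_t
  &= \frac{dS_t}{S_t} - \frac{1}{2}\,\frac{d\langle S\rangle_t}{S_t^2}
   = Y_t^T\Gamma^T dB_t + \mu^T Y_t\,dt - \tfrac{1}{2}\,Y_t^T\Sigma Y_t\,dt, \\
\frac{\log S_t}{t}
  &= \frac{\log S_0}{t}
   + \frac{1}{t}\int_0^t Y_s^T\Gamma^T dB_s
   + \frac{1}{t}\int_0^t \Big(\mu^T Y_s - \tfrac{1}{2}\,Y_s^T\Sigma Y_s\Big)\,ds .
\end{align*}
```

Because $Y_s$ stays in the simplex, the stochastic integral is a martingale whose quadratic variation $\int_0^t Y_s^T\Sigma Y_s\,ds$ grows at most linearly in $t$, so the middle term vanishes almost surely after division by $t$. The last term is a time average of a bounded continuous function of $Y_s$, which converges to the corresponding expectation under the law of $Y_\infty$ by the empirical-measure statement in Proposition 3.1.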



The limit $\chi$ in (5) is generally known as the Lyapunov exponent for the Markov process $X$. Following Tuljapurkar [1990], we also call $\chi$ the stochastic growth rate of the population, as it describes the asymptotic growth rate of the population in the presence of stochasticity. To interpret (5), notice that

(6) $\langle\mu\rangle := \mu^T\mathbb{E}[Y_\infty] = \sum_i \mu_i\,\mathbb{E}[Y_\infty^i] = \lim_{t\to\infty}\sum_i \mu_i\,\mathbb{E}[Y_t^i]$

corresponds to a weighted average of the per-capita growth rates with respect to the long-term spatial distribution $Y_\infty$ of the population. To interpret the other component of (5), let $\mathrm{Var}[X]$ denote the variance of a random variable $X$. Since $\sum_i Y_t^i(E^i_{t+\Delta t} - E^i_t)$ for small $\Delta t > 0$ is approximately the average environmental change experienced by the population over the time interval $[t, t+\Delta t]$,

(7) $\langle\sigma^2\rangle := \mathbb{E}[Y_\infty^T\Sigma Y_\infty] = \lim_{t\to\infty}\frac{1}{\Delta t}\mathrm{Var}\big[Y_t^T(E_{t+\Delta t} - E_t)\big] = \lim_{t\to\infty}\frac{1}{\Delta t}\mathrm{Var}\Big[\sum_i Y_t^i(E^i_{t+\Delta t} - E^i_t)\Big]$ for any $\Delta t > 0$

corresponds to the infinitesimal variance of the environmental fluctuations weighted by the long-term spatial distribution.



Biological interpretation of Theorem 3.2. The stochastic growth rate $\langle\mu\rangle - \langle\sigma^2\rangle/2$ for a spatially structured population is just what we see for an unstructured population where $\langle\mu\rangle$ and $\langle\sigma^2\rangle$ are the per-capita growth rate and the infinitesimal covariances of the temporal fluctuations averaged appropriately with respect to the equilibrium spatial distribution. Hence, as in a spatially homogeneous environment, environmental fluctuations reduce the population growth rate. However, as we show in greater detail below, interactions between dispersal patterns, spatial heterogeneity and environmental fluctuations may increase the stochastic growth rate by increasing $\langle\mu\rangle$ or decreasing $\langle\sigma^2\rangle$.



To get a more explicit expression for the stochastic growth rate, we need to determine the distribution of the equilibrium $Y_\infty$, or at least find its first and second moments. This problem reduces to solving for the time-invariant solution of the Fokker-Planck equations with appropriate boundary conditions [Gardiner, 2004]. Namely, the density $\rho : \Delta \to [0, \infty)$ of $Y_\infty$ satisfies

(8) $-\sum_i \frac{\partial}{\partial y_i}\big[M_i(y)\rho(y)\big] + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial y_i\,\partial y_j}\big[V_{ij}(y)\rho(y)\big] = 0$ for $y \in \Delta$,

where $M_i$ and $V_{ij}$ are the entries of

$M(y) = D^Ty + (\mathrm{diag}(y) - yy^T)(\mu - \Sigma y)$ and $V(y) = (\mathrm{diag}(y) - yy^T)\,\Gamma^T\Gamma\,(\mathrm{diag}(y) - yy^T)$,

respectively, and $\rho$ is constrained to have $\int_\Delta \rho(y)\,dy = 1$. However, the PDE (8) needs to be supplemented with appropriate boundary conditions. In principle, these are found by characterizing the domain of the infinitesimal generator of the Feller diffusion process $Y$ and thence characterizing the domain of the adjoint of this operator [Khas'minskii, 1960; Bhattacharya, 1978; Bogachev et al., 2002, 2009]. This appears to be a quite difficult problem. However, in the case of two patches, the problem simplifies to solving an ODE on the unit interval.

Figure 1. Spatial distribution and population growth in a two patch environment. In (a), the stochastic growth rate $\chi$ is plotted as a function of the dispersal rate $\delta$. In (b), the stationary density of the fraction of individuals in patch 1 is plotted for different dispersal rates. Parameter values are $\mu_1 = \mu_2 = 0.3$, $\sigma_1 = \sigma_2 = 1$, and $D_{12} = D_{21} = \delta$.

Example 3.1 Stochastic growth in two patch environments. Assume there are two patches. For simplicity, suppose there are no environmental correlations between the patches; that is, that $\sigma_{ii} = \sigma_i^2$ and $\sigma_{ij} = 0$ for $i \ne j$. Proposition 3.1 gives that $Y_t^1 = X_t^1/(X_t^1 + X_t^2)$ satisfies the one-dimensional stochastic differential equation

$dY_t^1 = M_1(Y_t^1)\,dt + \sqrt{V_1(Y_t^1)}\,dB_t$,

where

$M_1(y) := y(1-y)\big(\mu_1 - \mu_2 - \sigma_1^2 y + \sigma_2^2(1-y)\big) - D_{12}\,y + D_{21}(1-y)$

and

$V_1(y) := y^2(1-y)^2(\sigma_1^2 + \sigma_2^2)$.



We can then apply standard tools for one-dimensional diffusions [Gardiner, 2004] (checking that the boundaries at 0 and 1 are "entrance", and hence inaccessible) to find that the density $\rho : [0,1] \to [0,\infty)$ of $Y_\infty^1$ is given by

$\rho(y) = \frac{C_1}{V_1(y)}\exp\Big(2\int \frac{M_1(y)}{V_1(y)}\,dy\Big) = C_2\,y^{\beta-2}(1-y)^{-\beta}\exp\Big(-\frac{2}{\sigma_1^2+\sigma_2^2}\Big(\frac{D_{21}}{y} + \frac{D_{12}}{1-y}\Big)\Big)$,

where the $C_i$ are normalization constants, and

$\beta := \frac{2}{\sigma_1^2+\sigma_2^2}\big(\mu_1 - \mu_2 + \sigma_2^2 + D_{21} - D_{12}\big)$.



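Using this expression in (5), we get the following explicit expression for the stochastic growth rate

$\chi = \mu_1\int_0^1 y\,\rho(y)\,dy + \mu_2\int_0^1 (1-y)\,\rho(y)\,dy - \frac{\sigma_1^2}{2}\int_0^1 y^2\rho(y)\,dy - \frac{\sigma_2^2}{2}\int_0^1 (1-y)^2\rho(y)\,dy$

$\phantom{\chi} = \mu_2 - \frac{\sigma_2^2}{2} + (\mu_1 - \mu_2 + \sigma_2^2)\int_0^1 y\,\rho(y)\,dy - \frac{\sigma_1^2+\sigma_2^2}{2}\int_0^1 y^2\rho(y)\,dy$.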



Despite its apparent complexity, this formula provides insights into how dispersal may influence population growth. For example, consider a population dispersing diffusively between statistically similar but uncorrelated patches (that is, $D_{12} = D_{21} = \delta/2$, $\mu_1 = \mu_2 = \mu$, and $\sigma_1 = \sigma_2 = \sigma$). We claim that the stochastic growth rate $\chi$ is an increasing function of the dispersal rate $\delta$. Intuitively, this occurs because increasing $\delta$ decreases the variance of the random variable $Y_\infty^1$ but has no effect on its expectation.

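To verify our claim that $\chi$ is increasing with $\delta$, write $\rho(\cdot\,;\delta)$ for the density of $Y_\infty^1$ to emphasize its dependence on $\delta$ and notice that in this case

$\rho(y;\delta) = \frac{1}{C(\delta)}\,y^{-1}(1-y)^{-1}\exp\Big(-\frac{\delta}{2\sigma^2 y(1-y)}\Big)$, $y \in (0,1)$,

where $C(\delta) = \int_0^1 y^{-1}(1-y)^{-1}\exp\big(-\frac{\delta}{2\sigma^2 y(1-y)}\big)\,dy$ is the normalization constant and

(9) $\chi(\delta) = \mu - \sigma^2/2 + \sigma^2\int_0^1 y(1-y)\,\rho(y;\delta)\,dy$.

It suffices to show that

$\int_0^1 y(1-y)\,\rho(y;2\delta\sigma^2)\,dy = \frac{\int_0^1 \exp\big(-\frac{\delta}{y(1-y)}\big)\,dy}{\int_0^1 y^{-1}(1-y)^{-1}\exp\big(-\frac{\delta}{y(1-y)}\big)\,dy}$

is an increasing function of $\delta > 0$. Differentiating with respect to $\delta$ and carrying the differentiation inside the integral sign, we obtain

$C(2\sigma^2\delta)^{-2} \times \Big[\int_0^1 y^{-2}(1-y)^{-2}\exp\Big(-\frac{\delta}{y(1-y)}\Big)\,dy \int_0^1 \exp\Big(-\frac{\delta}{y(1-y)}\Big)\,dy - \Big(\int_0^1 y^{-1}(1-y)^{-1}\exp\Big(-\frac{\delta}{y(1-y)}\Big)\,dy\Big)^2\Big]$.

Up to a positive factor, this quantity is the variance of the random variable $(Z(1-Z))^{-1}$, where $Z$ has density proportional to $\exp(-\delta/(y(1-y)))$ on $(0,1)$, and is thus nonnegative.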



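For the purpose of comparison with general asymptotic approximations that we develop later, we note that after a change of variable

$\frac{\int_0^1 \exp\big(-\frac{\delta}{y(1-y)}\big)\,dy}{\int_0^1 y^{-1}(1-y)^{-1}\exp\big(-\frac{\delta}{y(1-y)}\big)\,dy} = \frac{\int_0^\infty e^{-\delta w}\,w^{-1/2}(w+4)^{-3/2}\,dw}{\int_0^\infty e^{-\delta w}\,w^{-1/2}(w+4)^{-1/2}\,dw}$.

Upon expanding the two functions $w \mapsto (w+4)^{-3/2}$ and $w \mapsto (w+4)^{-1/2}$ in Taylor series around 0 and integrating, we find that the ratio of integrals is of the form

$\frac{1}{4} - \frac{1}{32\,\delta} + O(\delta^{-2})$

as $\delta \to \infty$, so that

(10) $\chi(\delta) \approx \mu - \frac{\sigma^2}{4} - \frac{\sigma^4}{16\,\delta}$ as $\delta \to \infty$.

Approximation (10) implies, as we prove more generally in Proposition 4.1, that $\lim_{\delta\to\infty}\chi(\delta) = \mu - \sigma^2/4$.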
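Formula (9) and approximation (10) can be compared numerically. The sketch below is an illustration rather than the authors' code: it evaluates (9) for the symmetric case of Example 3.1 by midpoint quadrature on a uniform grid, and compares the result with (10). The function names, the grid size, and the list of $\delta$ values are assumptions; the values $\mu = 0.3$, $\sigma = 1$ are those quoted in the Figure 1 caption.

```python
import math

def chi_two_patch(mu, sigma, delta, n_grid=20000):
    """Growth rate (9) for D12 = D21 = delta/2, by midpoint quadrature."""
    num = 0.0  # integral of y(1-y) times the unnormalized density
    den = 0.0  # integral of the unnormalized density
    h = 1.0 / n_grid
    for k in range(n_grid):
        y = (k + 0.5) * h
        u = y * (1.0 - y)
        # Unnormalized density; the constant factor exp(2*delta/sigma^2)
        # is included to avoid underflow and cancels in the ratio.
        w = math.exp(-delta / (2.0 * sigma**2) * (1.0 / u - 4.0)) / u
        num += u * w
        den += w
    return mu - 0.5 * sigma**2 + sigma**2 * num / den

def chi_large_delta(mu, sigma, delta):
    """Large-delta approximation (10)."""
    return mu - sigma**2 / 4.0 - sigma**4 / (16.0 * delta)

mu, sigma = 0.3, 1.0
for delta in (0.1, 1.0, 10.0, 50.0):
    print(delta, chi_two_patch(mu, sigma, delta), chi_large_delta(mu, sigma, delta))
```

For these parameters the computed growth rate should increase with $\delta$ from values near $\mu - \sigma^2/2 = -0.2$ towards the limit $\mu - \sigma^2/4 = 0.05$, crossing zero at some intermediate dispersal rate.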



Biological interpretation of Example 3.1. Even if two patches are unable to sustain a population in the absence of dispersal, connecting the patches by dispersal can permit persistence. This phenomenon occurs only at intermediate levels of environmental stochasticity (i.e. $2\mu < \sigma^2 < 4\mu$). Moreover, when this phenomenon occurs, there is a critical dispersal threshold $\delta^* > 0$ such that the metapopulation decreases to extinction whenever its dispersal rate is too low (i.e. $\delta < \delta^*$) and persists otherwise (Fig. 1).



Because there do not appear to be closed-form expressions for the law of the stable patch distribution $Y_\infty$ when there are more than two patches, we must seek other routes to understanding the stochastic growth rate in such cases. One approach would be to solve the PDE (8) numerically. A second approach would be to simulate the stochastic process $Y$ for long time intervals and derive approximate values for the first and second moments of the equilibrium distribution. To give an indication of the range of phenomena that can occur in even relatively simple systems where there is biased movement between patches, we adopt the even simpler approach of simulating the stochastic process $X$ directly for long time intervals to obtain an approximate value of the stochastic growth rate. We implemented the simulations in a manner similar to that of Talay [1991], and the R code used is provided as supplementary material.
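The direct-simulation approach can be sketched as follows. This is not the authors' R code and not Talay's scheme: it is a plain Euler-Maruyama discretization of (3), written under the simplifying assumption that $\Sigma$ is diagonal with entries $\sigma_i^2$, and it renormalizes the abundance vector at every step while accumulating $\log S_t$. The function name `estimate_growth_rate`, the step size, the run length, and the seed are illustrative choices.

```python
import math
import random

def estimate_growth_rate(mu, sigma, D, t_end, dt, rng):
    """Estimate chi = lim log(S_t)/t by Euler-Maruyama on (3), diagonal Sigma."""
    n = len(mu)
    x = [1.0 / n] * n   # patch proportions; total abundance lives in log_s
    log_s = 0.0
    sqrt_dt = math.sqrt(dt)
    for _ in range(int(round(t_end / dt))):
        new_x = []
        for i in range(n):
            growth = x[i] * (mu[i] * dt + sigma[i] * sqrt_dt * rng.gauss(0.0, 1.0))
            flow = sum(D[j][i] * x[j] for j in range(n)) * dt  # (D^T x)_i dt
            new_x.append(x[i] + growth + flow)
        total = sum(new_x)
        log_s += math.log(total)        # growth of total abundance this step
        x = [v / total for v in new_x]  # renormalize to avoid over/underflow
    return log_s / t_end

rng = random.Random(7)   # arbitrary fixed seed
mu, sigma, delta = [0.3, 0.3], [1.0, 1.0], 20.0
D = [[-delta / 2, delta / 2], [delta / 2, -delta / 2]]
chi_hat = estimate_growth_rate(mu, sigma, D, t_end=2000.0, dt=0.01, rng=rng)
print("coupled patches:", chi_hat, " isolated patch:", mu[0] - 0.5 * sigma[0] ** 2)
```

For these two-patch parameters the estimate should lie near the value given by approximation (10), roughly 0.047, whereas each patch in isolation declines at rate $-0.2$. The step size must be small relative to both the noise intensity and the dispersal rates for the Euler step to keep abundances positive.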



Example 3.2 Spatially heterogeneous environments with biased emigration. For these simulations, we consider a metapopulation with either $n = 8$ or $n = 40$ patches of which one quarter are higher quality ($\mu_i = 10$ in these patches) and the remainder are lower quality ($\mu_i = 1$ in the remaining patches). All patches have the same level of spatially uncorrelated environmental noise ($\sigma_{ii} = 16$ for all $i$ and $\sigma_{ij} = 0$ for $i \ne j$). When an organism exits a patch it chooses from the other patches with equal probability, but the emigration rate from a patch depends on the patch quality.

First, we consider the case in which emigration is "adaptive" in the sense that individuals emigrate more rapidly out of lower quality patches than higher quality patches:

$D_{ij} = \begin{cases} \delta, & \text{for } i = 1, \ldots, n/4 \text{ and } i \ne j, \\ 10\,\delta, & \text{for } i = n/4+1, \ldots, n \text{ and } i \ne j. \end{cases}$

Here, the parameter $\delta > 0$ scales the emigration rate, so that doubling $\delta$ doubles the emigration rate from all patches. As expected, since in this case dispersal is "adaptive", Figure 2 shows that the stochastic growth rate $\chi = \chi(\delta)$ as a function of $\delta$ increases with $\delta$. Moreover, Figure 2 shows asymptotic values as $\delta \to \infty$ for each case, and illustrates that the analytic approximation developed later in Theorem 5.2 works reasonably well for large values of $\delta$. The figure also shows extremely slow convergence as $\delta \to 0$ to $\chi(0) = \max_i(\mu_i - \sigma_i^2/2)$ (note the logarithmic scale on the horizontal axis), indicating that although $\chi$ is continuous at $\delta = 0$ by Proposition 5.1 below, it may not be differentiable there.

Figure 2. The effect of dispersal rate $\delta$ on populations emigrating more rapidly out of lower quality patches than higher quality patches. Shown is the stochastic growth rate $\chi$ estimated from simulation of the SDE for 100 time units, across a range of values of $\delta$, for both a 40-patch and an 8-patch model. Standard errors are estimated using the standard deviation of the stochastic growth rates across nonoverlapping time segments of a given simulation. Details of the dispersal matrix and parameter values are described in the main text. The right-hand axis shows asymptotic values for $\delta = 0$ and $\delta = \infty$, which are: $\chi(0) = \max_i(\mu_i - \sigma_i^2/2)$ and $\chi(\infty) = \mu^T\pi - \frac{1}{2}\pi^T\Sigma\pi$ (Proposition 4.1). "High dispersal" shows the approximation of the form $\chi(\delta) \approx a + b/\delta$ for large $\delta$ calculated from formula (19) of Theorem 5.2.



Next we consider a case in which emigration is "maladaptive", in the sense that individuals emigrate more rapidly out of higher quality patches than out of lower quality patches:

$D_{ij} = \begin{cases} 10\,\delta, & \text{for } i = 1, \ldots, n/4 \text{ and } i \ne j, \\ \delta, & \text{for } i = n/4+1, \ldots, n \text{ and } i \ne j. \end{cases}$

It is possible to show using the results of Section 5 below that in this regime, high dispersal rates lead to a lower stochastic growth rate than for sedentary populations (that is, $\lim_{\delta\to\infty}\chi(\delta)$ is dominated by $\lim_{\delta\to 0}\chi(\delta)$), and yet $\chi(\delta)$ increases with $\delta$ when $\delta$ is large. As illustrated in Figure 3, the stochastic growth rate $\chi(\delta)$ exhibits a rather complex dependence on $\delta$: increasing at low dispersal rates, declining at higher dispersal rates, and finally increasing again at the highest dispersal rates.
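The two limiting growth rates quoted in the Figure 2 caption can be evaluated directly for both emigration schemes. The sketch below rests on an assumption about notation that this excerpt does not spell out: $\pi$ is taken to be the stationary distribution of the Markov chain generated by $D$ (that is, $\pi^TD = 0$ with entries summing to one). For the schemes of Example 3.2, where the rate $D_{ij}$ depends only on the source patch $i$, that distribution is proportional to the reciprocal of the emigration rate. The function names and argument names are hypothetical.

```python
def example_dispersal(n, delta, high_rate, low_rate):
    """D_ij = rate_i * delta for i != j; the first n/4 patches are high quality."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        r = (high_rate if i < n // 4 else low_rate) * delta
        for j in range(n):
            D[i][j] = r if j != i else -(n - 1) * r
    return D

def chi_limits(n, high_rate, low_rate, mu_high=10.0, mu_low=1.0, var=16.0):
    """Return chi(0), chi(infinity) and pi for Example 3.2 (Sigma = var * I)."""
    mu = [mu_high if i < n // 4 else mu_low for i in range(n)]
    chi0 = max(m - 0.5 * var for m in mu)
    # Assumed form of pi: proportional to 1 / (emigration rate of patch i).
    w = [1.0 / (high_rate if i < n // 4 else low_rate) for i in range(n)]
    pi = [v / sum(w) for v in w]
    chi_inf = sum(m * p for m, p in zip(mu, pi)) - 0.5 * var * sum(p * p for p in pi)
    return chi0, chi_inf, pi

# Adaptive: slow emigration from high-quality patches; maladaptive: the reverse.
adaptive = chi_limits(8, high_rate=1.0, low_rate=10.0)
maladaptive = chi_limits(8, high_rate=10.0, low_rate=1.0)
print("adaptive:    chi(0) =", adaptive[0], " chi(inf) =", adaptive[1])
print("maladaptive: chi(0) =", maladaptive[0], " chi(inf) =", maladaptive[1])
```

Under these assumptions, with $n = 8$ the adaptive scheme gives a high-dispersal limit above the sedentary value, while the maladaptive scheme gives a high-dispersal limit below it, in line with the ordering of the limits described in the text.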

In a conservation framework, increasing $\delta$ corresponds to facilitating movement between patches by increasing the size or number of dispersal corridors between patches.

Biological interpretation of Example 3.2. For populations exhibiting adaptive movement, increasing the size or number of dispersal corridors between patches enhances metapopulation growth rates. For populations exhibiting maladaptive movement, however, increasing dispersal rates can either increase or decrease metapopulation growth rates.

Figure 3. The effect of dispersal rate $\delta$ on populations emigrating more rapidly out of higher quality patches than lower quality patches. Details are as in Figure 2 but with a different dispersal scheme; parameter values are described in the main text.



4. Ideal free dispersal in a stochastic environment

A basic quandary in evolutionary ecology is, "For a given set of environmental conditions, what dispersal pattern maximizes fitness?" Since fitness in our context corresponds to the stochastic growth rate of the population, we can rephrase this question as, "Given $\mu$ and $\Sigma$, what form of the dispersal matrix $D$ maximizes $\chi$?" Following Fretwell and Lucas [1970], we call such an optimal dispersal mechanism ideal free dispersal as individuals have no constraints on their dispersal (i.e. are "free") and have complete knowledge about the distribution of spatial-temporal fluctuations (i.e. are "ideal").

The characterization of the stochastic growth rate as $\chi = \mu^T E[Y_\infty] - \frac{1}{2}E[Y_\infty^T \Sigma Y_\infty]$ provides a means to answer this question. Because $\Sigma$ has full rank, the function $y \mapsto y^T\Sigma y$ is strictly convex, and so Jensen's inequality implies that

$$E[Y_\infty^T \Sigma Y_\infty] \ge E[Y_\infty]^T \Sigma\, E[Y_\infty],$$

with equality if and only if the random vector $Y_\infty$ is almost surely constant. Hence, to maximize the stochastic growth rate $\chi$, we need to eliminate the variability in $Y_\infty$, so that $Y_\infty = y$ almost surely for a constant $y$ that is chosen to maximize

$$\mu^T y - \frac{1}{2} y^T \Sigma y \tag{11}$$

subject to the constraint $y \in \Delta$. Under our standing non-degeneracy assumptions on $D$ and $\Sigma$, the law of $Y_\infty$ is supported on all of $\Delta$, and so we cannot actually achieve a situation in which $Y_\infty$ is a constant. However, the following result, which we prove in Appendix B, shows that we can approach this regime arbitrarily closely. Recall that the stationary distribution $\pi$ for an irreducible dispersal matrix $Q$ is a probability vector $\pi \in \Delta$ such that $\pi^T Q = 0$. We note that any vector $\pi$ in the interior of $\Delta$ is the stationary distribution for some irreducible dispersal matrix $Q$. For example, given $\pi$, we can define $Q = \mathbf{1}\pi^T - I$ where $I$ denotes the identity matrix.

Proposition 4.1. Consider a vector $\pi$ in the interior of $\Delta$ and an irreducible dispersal matrix $Q$ that has $\pi$ as its unique stationary distribution. Let $Y_\infty(\delta)$ be the equilibrium patch distribution and $\chi(\delta)$ be the stochastic growth rate for (3) with $D = \delta Q$. Then $Y_\infty(\delta)$ converges in law to the constant vector $\pi$ as $\delta \to \infty$, and $\chi(\delta)$ converges to $\mu^T\pi - \frac{1}{2}\pi^T \Sigma \pi$ as $\delta \to \infty$.

In the absence of population growth due to deterministic or stochastic effects, each of the dispersal matrices $\delta Q$ in Proposition 4.1 sends the patch distribution to the vector $\pi$ regardless of the initial conditions, and the speed at which this happens increases with $\delta$, so that it becomes effectively instantaneous for large $\delta$. Proposition 4.1 says that this push towards a deterministic equilibrium overcomes any disruptive effects introduced by population growth provided $\delta$ is sufficiently large, and so it is possible to produce random equilibrium patch distributions that are arbitrarily close to any given vector $\pi$ in the interior of $\Delta$. If we further approximate vectors $\pi$ on the boundary of $\Delta$ by ones in the interior, we see that it is possible to produce equilibrium patch distributions that are arbitrarily close to any given vector in $\Delta$.
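The claim that $\delta Q$ drives the patch distribution to $\pi$ ever faster as $\delta$ grows can be checked numerically when growth and noise are switched off. The sketch below is illustrative only: it assumes the example matrix $Q = \mathbf{1}\pi^T - I$ and the linear dynamics $dy/dt = \delta Q^T y$ for patch proportions; the function names and parameter values are hypothetical and not taken from the paper.

```python
import numpy as np

def dispersal_matrix_from_pi(pi):
    """Build Q = 1 pi^T - I, whose rows sum to zero and which satisfies pi^T Q = 0."""
    pi = np.asarray(pi, dtype=float)
    n = pi.size
    return np.outer(np.ones(n), pi) - np.eye(n)

def distribution_at_time(y0, pi, delta, t):
    """Exact solution of dy/dt = delta * Q^T y for Q = 1 pi^T - I.

    For this Q, Q^T y = pi * sum(y) - y, so when sum(y0) = 1 the solution is
    y(t) = pi + exp(-delta * t) * (y0 - pi): relaxation to pi at rate delta.
    """
    y0 = np.asarray(y0, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return pi + np.exp(-delta * t) * (y0 - pi)

if __name__ == "__main__":
    pi = np.array([0.5, 0.3, 0.2])
    y0 = np.array([1.0, 0.0, 0.0])      # everyone starts in patch 1
    for delta in (1.0, 10.0, 100.0):
        gap = np.linalg.norm(distribution_at_time(y0, pi, delta, t=0.1) - pi)
        print(f"delta={delta:6.1f}  distance from pi at t=0.1: {gap:.3e}")
```

The distance from $\pi$ at a fixed time shrinks like $e^{-\delta t}$, which is the "effectively instantaneous" redistribution described above; the stochastic model adds growth and noise on top of this deterministic pull.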

Given that any patch distribution can be approximated arbitrarily closely by the equilibrium patch distribution of a suitable population of rapidly dispersing individuals, the problem of optimizing $\chi$ reduces, as we have already noted, to maximizing the strictly concave function $g(y) = \mu^T y - \frac{1}{2} y^T \Sigma y$ over the compact, convex set $\Delta$. Strict concavity implies that there is at most one local maximum, and compactness of $\Delta$ guarantees that a maximizer exists. Denote this unique maximizer by $y^* = (y_1^*, \dots, y_n^*)^T$.

It is optimal for all individuals to remain in the single patch $k$ (that is, $y_k^* = 1$) only if

$$\frac{\partial g}{\partial y_i}(e_k) - \frac{\partial g}{\partial y_k}(e_k) = \mu_i - \sigma_{ik} - \mu_k + \sigma_{kk} \le 0 \quad \text{for all } i \ne k,$$

where $e_k$ is the $k$-th element of the standard basis of $\mathbb{R}^n$, or, equivalently,

$$\mu_k - \mu_i \ge \sigma_{kk} - \sigma_{ik} \quad \text{for all } i \ne k. \tag{12}$$

Biological interpretation of equation (12). If the variances of environmental fluctuations are sufficiently large in all patches and the spatial covariances in these environmental fluctuations are sufficiently small, then ideal free dispersers occupy multiple patches.



When it is optimal to disperse between several patches, we can solve for the optimal dispersal strategy $y^*$ by using the method of Lagrange multipliers. Without loss of generality, assume that the optimal strategy $y^*$ makes use of all patches, that is, that $y^*$ is in the interior of $\Delta$. Indeed, if the optimal strategy does not make use of all patches, then we can consider analogous problems on the faces of the convex polytope $\Delta$ of the form $\{y \in \Delta : y_i = 0,\ i \in A\}$, where $A$ is a subset of $\{1, \dots, n\}$. Because

$$\nabla g(y) = \mu - \Sigma y \quad \text{and} \quad \nabla\Big(\sum_j y_j\Big) = \mathbf{1},$$

the optimal $y^*$ must satisfy

$$\mu - \Sigma y^* = \lambda \mathbf{1}, \tag{13}$$

where $\lambda$ is a Lagrange multiplier. Notice that the $i$-th entry of $\Sigma y^*$,

$$(\Sigma y^*)_i = \sum_j \sigma_{ij}\, y_j^*,$$

is the covariance between the environmental noise in patch $i$ and the noise averaged over the patches with weights $y^*$. Hence, we get the following interpretation.

Biological interpretation of equation (13). Ideal free populations using multiple patches are distributed across the patches in such a way that the differences between the mean per-capita growth rates and the covariances between the within patch noise and the noise experienced on average by an individual are equal in all occupied patches. In particular, the local stochastic growth rates $\mu_i - \sigma_{ii}/2$ need not be equal in all occupied patches.



Figure 4. Effects of spatial correlations on the ideal free patch distribution in a 15 patch environment. Per-capita growth rates $\mu_i$ are plotted in the top left. The ideal free patch distribution $y^*$ is plotted at three levels of spatial correlation $\rho$. Covariances are $\sigma_{ii} = 2$ and $\sigma_{ij} = 2\rho$ for $i \ne j$.

Now,

$$y^* = \Sigma^{-1}(\mu - \lambda \mathbf{1}), \tag{14}$$

and the constraint $\mathbf{1}^T y^* = 1$ yields

$$1 = \mathbf{1}^T \Sigma^{-1}(\mu - \lambda \mathbf{1}),$$

so that

$$\lambda = \frac{\mathbf{1}^T \Sigma^{-1} \mu - 1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}, \tag{15}$$

and

$$y^* = \Sigma^{-1}\left(\mu - \frac{\mathbf{1}^T \Sigma^{-1} \mu - 1}{\mathbf{1}^T \Sigma^{-1} \mathbf{1}}\, \mathbf{1}\right). \tag{16}$$

The right-hand side of equation (16) is the optimal vector $y^*$ we seek, provided that it belongs to the interior of $\Delta$. Otherwise, as we remarked above, we need to perform similar analyses on the faces of the simplex $\Delta$.
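The Lagrange-multiplier solution can be evaluated directly with a linear solver. The following is a minimal sketch, assuming the reading $y^* = \Sigma^{-1}(\mu - \lambda\mathbf{1})$ with $\lambda = (\mathbf{1}^T\Sigma^{-1}\mu - 1)/(\mathbf{1}^T\Sigma^{-1}\mathbf{1})$; the function name and the example numbers are hypothetical.

```python
import numpy as np

def ideal_free_distribution(mu, Sigma):
    """Evaluate y* = Sigma^{-1} (mu - lambda 1), with lambda fixed by 1^T y* = 1.

    Returns (y_star, lam, interior). `interior` is False when some entry of
    y_star is not strictly positive; the formula then does not give the
    optimum, and the faces of the simplex have to be examined instead.
    """
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    ones = np.ones_like(mu)
    a = np.linalg.solve(Sigma, mu)      # Sigma^{-1} mu
    b = np.linalg.solve(Sigma, ones)    # Sigma^{-1} 1
    lam = (ones @ a - 1.0) / (ones @ b)
    y_star = a - lam * b
    return y_star, lam, bool(np.all(y_star > 0))

if __name__ == "__main__":
    mu = [0.3, 0.2, 0.25]
    Sigma = np.diag([1.0, 2.0, 1.5])    # uncorrelated noise, illustrative values
    y_star, lam, interior = ideal_free_distribution(mu, Sigma)
    print(y_star, lam, interior)
```

Solving two linear systems avoids forming $\Sigma^{-1}$ explicitly. The `interior` flag only reports whether the closed form is applicable; it does not carry out the search over faces of $\Delta$.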

To illustrate the utility of this formula, we examine two special cases: when the environmental noise between patches is uncorrelated, and when the patches experience the same individual levels of noise but they are spatially correlated.

Example 4.1 Spatially uncorrelated environments. Suppose that there are no spatial correlations in the environmental noise, so that $\Sigma$ is a diagonal matrix with diagonal entries $\sigma_{ii} = \sigma_i^2$. It follows from equation (16) that the ideal free patch distribution is

$$y_i^* = \frac{1 - \sum_j (\mu_j - \mu_i)/\sigma_j^2}{\sigma_i^2 \sum_j 1/\sigma_j^2}, \quad i = 1, \dots, n, \tag{17}$$

provided that $\sum_j (\mu_j - \mu_i)/\sigma_j^2 < 1$ for all $i$.

Biological interpretation of equation (17). In the absence of spatial correlations in environmental fluctuations, ideal free dispersers visit all patches whenever the environmental variation is sufficiently great relative to differences in the mean per-capita growth rates. In particular, if all mean per-capita growth rates are equal, then the fraction of individuals in a patch is inversely proportional to the variation in temporal fluctuations in the patch; that is, $y_i^* = (1/\sigma_i^2) / (\sum_j 1/\sigma_j^2)$.

Example 4.2 Spatially correlated environments. Suppose that the infinitesimal variance of the temporal fluctuations in each patch is $\sigma^2$ and that the correlation between the fluctuations in any pair of patches is $\rho$. Thus, $\Sigma = \sigma^2(1-\rho)I + \sigma^2\rho J$, where $J = \mathbf{1}\mathbf{1}^T$ is the matrix in which every entry is 1. Provided that $-1/(n-1) < \rho < 1$, the matrix $\Sigma$ is non-singular with inverse

$$\Sigma^{-1} = \frac{1}{\sigma^2(1-\rho)}\left(I - \frac{\rho}{1 + (n-1)\rho}\, J\right).$$

Denoting by $\bar\mu = \frac{1}{n}\sum_i \mu_i$ the average across the patches of the mean per-capita growth rates, the optimal dispersal strategy is given by

$$y_i^* = \frac{1}{n} + \frac{\mu_i - \bar\mu}{\sigma^2(1-\rho)} \tag{18}$$

provided that $y_i^* > 0$ for all $i$. Notice that (18) agrees with (17) when $\rho = 0$ and $\sigma_i = \sigma$ for all $i$.

Biological interpretation of equation (18). If environmental fluctuations have a sufficiently large variance $\sigma^2$, then ideal free dispersers visit all patches and spend more time in patches that support higher mean per-capita growth rates. Increasing the common spatial correlation $\rho$ results in ideal free dispersers spending more time in patches whose mean per-capita growth rate is greater than the average of the mean per-capita growth rates and less time in other patches (Fig. 4). When the spatial correlations are sufficiently large, it is no longer optimal to disperse to the patches with lower mean per-capita growth rates ($\rho = 0.5$ and $\rho = 0.95$ in Fig. 4).
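The closed form for equicorrelated noise can be compared against a direct solve of the Lagrange conditions. This is a sketch under stated assumptions: it reads (18) as $y_i^* = 1/n + (\mu_i - \bar\mu)/(\sigma^2(1-\rho))$, and the growth rates used are made up for illustration (they are not the values behind Figure 4).

```python
import numpy as np

def equicorrelated_covariance(n, sigma2, rho):
    """Sigma = sigma2 * ((1 - rho) I + rho J) with J the all-ones matrix."""
    return sigma2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

def ideal_free_equicorrelated(mu, sigma2, rho):
    """Closed form y_i = 1/n + (mu_i - mean(mu)) / (sigma2 * (1 - rho)).

    Only a valid patch distribution when every entry is positive.
    """
    mu = np.asarray(mu, dtype=float)
    return 1.0 / mu.size + (mu - mu.mean()) / (sigma2 * (1.0 - rho))

if __name__ == "__main__":
    mu = np.linspace(0.1, 0.4, 4)       # hypothetical per-capita growth rates
    for rho in (0.0, 0.5, 0.95):
        y = ideal_free_equicorrelated(mu, sigma2=2.0, rho=rho)
        print(f"rho={rho:4.2f}  y={np.round(y, 3)}  all patches used: {bool(np.all(y > 0))}")
```

As $\rho$ approaches 1 the denominator $\sigma^2(1-\rho)$ shrinks, so differences in $\mu_i$ are amplified and the entries for low-quality patches eventually turn negative; at that point the interior formula no longer applies and those patches are abandoned.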

5. The effect of constraints on dispersal

While the ideal free patch distribution is a useful idealization to investigate how organisms should disperse in the absence of constraints, organisms in the natural world have limits on their ability to disperse and to collect and interpret environmental information. Recall from Section 4 that if the optimal patch distribution $y^*$ for an ideal free disperser is in the interior of the probability simplex $\Delta$, then, loosely speaking, the ideal free disperser achieves the maximal stochastic growth rate by using a strategy for which the dispersal rate matrix is of the form $D = \delta Q$, where $Q$ is any irreducible dispersal matrix with $(y^*)^T Q = 0$ and $\delta = \infty$. At the opposite extreme, if $y^*$ assigns all of its mass to a single patch, then an ideal free disperser never leaves that single most-favored patch.

To get a better understanding of how constraints on dispersal influence population growth, we consider dispersal matrices of the form $D = \delta Q$, where $\delta \ge 0$ and $Q$ is a fixed irreducible dispersal matrix with a stationary distribution $\pi$ that is not necessarily the optimal patch distribution for an ideal free disperser in the given environmental conditions. We write $\chi(\delta)$ for the stochastic growth rate of the population as a function of the dispersal parameter $\delta$ and ask which choice of $\delta$ maximizes $\chi(\delta)$. In particular, we are interested in conditions under which some intermediate $\delta > 0$ maximizes the stochastic growth rate $\chi(\delta)$.

We know from Proposition 4.1 that $\chi(\delta)$ approaches $\pi^T\mu - \frac{1}{2}\pi^T\Sigma\pi$ as $\delta \to \infty$. We therefore set $\chi(\infty) = \pi^T\mu - \frac{1}{2}\pi^T\Sigma\pi$. On the other hand, if there is no dispersal ($\delta = 0$), then $\lim_{t\to\infty} \frac{1}{t}\log X_t^i = \mu_i - \frac{\sigma_{ii}}{2}$ with probability one whenever $X_0^i > 0$, and so $\lim_{t\to\infty} \frac{1}{t}\log S_t = \max_i\{\mu_i - \frac{\sigma_{ii}}{2}\}$ whenever $X_0^i > 0$ for all $i$. Hence, it is reasonable to set $\chi(0) = \max_i\{\mu_i - \frac{\sigma_{ii}}{2}\}$. The following result, which we prove in Appendix C, implies that the function $\delta \mapsto \chi(\delta)$ is continuous on $[0, \infty)$.
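The two endpoint values are simple to compute and compare. The sketch below assumes the definitions $\chi(0) = \max_i\{\mu_i - \sigma_{ii}/2\}$ and $\chi(\infty) = \pi^T\mu - \frac{1}{2}\pi^T\Sigma\pi$; the function names and the two-patch numbers are hypothetical.

```python
import numpy as np

def chi_zero(mu, Sigma):
    """chi(0) = max_i (mu_i - sigma_ii / 2): growth rate of the best isolated patch."""
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    return float(np.max(mu - 0.5 * np.diag(Sigma)))

def chi_inf(mu, Sigma, pi):
    """chi(infinity) = pi^T mu - (1/2) pi^T Sigma pi for stationary distribution pi."""
    mu, Sigma, pi = (np.asarray(x, dtype=float) for x in (mu, Sigma, pi))
    return float(pi @ mu - 0.5 * pi @ Sigma @ pi)

if __name__ == "__main__":
    mu = [0.1, 0.1]
    Sigma = np.eye(2)                   # independent noise, unit variance
    pi = [0.5, 0.5]
    print("chi(0)   =", chi_zero(mu, Sigma))
    print("chi(inf) =", chi_inf(mu, Sigma, pi))
```

In this illustrative case both patches are sinks in isolation, while spreading evenly over two independent patches halves the variance penalty, so $\chi(\infty) > \chi(0)$.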



Proposition 5.1. The function $\delta \mapsto \chi(\delta)$ is analytic on the interval $(0, \infty)$ and continuous at the point $\delta = 0$.

One way to establish that $\chi(\delta)$ is maximized for an intermediate value of $\delta$ is to show that $\chi(0) < \chi(\infty)$ and that $\chi(\delta) > \chi(\infty)$ for all sufficiently large $\delta$. The following theorem provides an asymptotic approximation for $\chi(\delta)$ when $\delta$ is large that allows us to check when the latter condition holds. We prove the theorem under the hypothesis that the dispersal matrix $Q$ is reversible with respect to its stationary distribution $\pi$; that is, that $\pi_i Q_{ij} = \pi_j Q_{ji}$ for all $i, j$. Reversibility implies that at stationarity the Markov chain defined by $Q$ exhibits "balanced dispersal in the absence of local demography." Namely, if a large number of individuals are independently executing the equilibrium movement dynamics, then the rate at which individuals move from patch $i$ to patch $j$ equals the rate at which individuals move from patch $j$ to patch $i$. We note that diffusive movement (that is, the matrix $Q$ is symmetric) and any form of movement along a one-dimensional landscape (that is, the matrix $Q$ is tridiagonal) are examples of reversible Markov chains. We provide a proof of the theorem in Appendix D. Corollary 5.3 below, which we prove in Appendix E, provides a more readily computable expression for the asymptotics of the stochastic growth rate under further assumptions.

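Theorem 5.2. Suppose that $Q$ is reversible with respect to its stationary distribution $\pi$. Then,

$$\chi(\delta) = \mu^T\pi - \frac{1}{2}\pi^T\Sigma\pi + \frac{1}{\delta}\left[(\mu - \Sigma\pi)^T\nu - \frac{1}{2}\int_0^\infty \mathrm{Tr}\Big(\exp(Q^T s)\,(\mathrm{diag}(\pi) - \pi\pi^T)\,\Sigma\,(\mathrm{diag}(\pi) - \pi\pi^T)\,\exp(Qs)\,\Sigma\Big)\,ds\right] + O(\delta^{-2}) \tag{19}$$

as $\delta \to \infty$, where $\nu$ is the unique vector satisfying $\mathbf{1}^T\nu = 0$ and $Q^T\nu = -(\mathrm{diag}(\pi) - \pi\pi^T)(\mu - \Sigma\pi)$.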



When the dispersal matrix $D = \delta Q$ is consistent with ideal dispersal in the limit $\delta \to \infty$, equation (13) implies that $(\mu - \Sigma\pi)^T\nu = \lambda \mathbf{1}^T\nu = 0$. On the other hand, the proof of Theorem 5.2 shows that

/•oo 

/ Tr (exp(Q^s) (diag(7r) - nn'^) E (diag(7r) - tttt^) exp(gs)i;) ds ^ Tr 
Jo 



> 



where $V_\infty$ is a Gaussian random vector. Hence, as expected, $\chi(\delta)$ is an increasing function for large $\delta$ when $\pi$ corresponds to the ideal free distribution associated with $\mu$ and $\Sigma$. However, when $\pi$ does not correspond to the ideal free distribution, $\chi(\delta)$ may be increasing or decreasing for large $\delta$ as we illustrate below.

When $Q$ and $\Sigma$ commute, the asymptotic expression (19) for $\chi(\delta)$ simplifies a great deal.

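Corollary 5.3. Suppose that $Q$ is symmetric and $Q\Sigma = \Sigma Q$. Let $\lambda_1 \le \dots \le \lambda_{n-1} < \lambda_n = 0$ be the eigenvalues of $Q$ with corresponding orthonormal eigenvectors $\xi_1, \dots, \xi_n$. Then, the eigenvalues $\theta_1, \dots, \theta_n$ of $\Sigma$ can be ordered so that $\Sigma\xi_k = \theta_k\xi_k$ for each $1 \le k \le n$, and the approximation (19) reduces to

$$\chi(\delta) = \bar\mu - \frac{1}{2n}\theta_n - \frac{1}{\delta n}\sum_{k=1}^{n-1}\frac{1}{\lambda_k}\left((\xi_k^T\mu)^2 - \frac{1}{4n}\theta_k^2\right) + O(\delta^{-2}) \tag{20}$$

as $\delta \to \infty$, where $\bar\mu = \frac{1}{n}\sum_i \mu_i$.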
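For symmetric $Q$ commuting with $\Sigma$, the leading terms of the large-$\delta$ expansion can be evaluated numerically. The sketch below is not the authors' code. It assumes the reading $\chi(\delta) \approx a + b/\delta$ with $a = \bar\mu - \theta_n/(2n)$ and $b = -\frac{1}{n}\sum_{k<n}\lambda_k^{-1}\big((\xi_k^T\mu)^2 - \theta_k^2/(4n)\big)$, and it evaluates the sums through the pseudo-inverse $Q^+ = \sum_{k<n}\xi_k\xi_k^T/\lambda_k$ so that no simultaneous eigenbasis has to be constructed. The function name is hypothetical.

```python
import numpy as np

def high_dispersal_expansion(Q, Sigma, mu):
    """Return (a, b) such that chi(delta) is approximately a + b / delta.

    Assumes Q is symmetric with a simple zero eigenvalue and Q Sigma = Sigma Q.
    Uses  sum_k (xi_k^T mu)^2 / lambda_k = mu^T Q^+ mu  and
          sum_k theta_k^2 / lambda_k     = Tr(Q^+ Sigma^2),
    where Q^+ is the Moore-Penrose pseudo-inverse of Q.
    """
    Q, Sigma, mu = (np.asarray(x, dtype=float) for x in (Q, Sigma, mu))
    n = mu.size
    if not (np.allclose(Q, Q.T) and np.allclose(Q @ Sigma, Sigma @ Q)):
        raise ValueError("Q must be symmetric and commute with Sigma")
    Q_pinv = np.linalg.pinv(Q)
    # theta_n = 1^T Sigma 1 / n, so theta_n / (2 n) = sum(Sigma) / (2 n^2).
    a = mu.mean() - Sigma.sum() / (2.0 * n**2)
    b = -(mu @ Q_pinv @ mu) / n + np.trace(Q_pinv @ Sigma @ Sigma) / (4.0 * n**2)
    return float(a), float(b)

if __name__ == "__main__":
    n, sigma2, rho = 4, 1.5, 0.2
    Q = np.ones((n, n)) / n - np.eye(n)                       # fully connected
    Sigma = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
    mu = np.array([0.1, 0.3, 0.2, 0.6])
    print(high_dispersal_expansion(Q, Sigma, mu))
```

A positive `b` means $\chi(\delta)$ approaches $\chi(\infty)$ from above, so it is decreasing for large $\delta$; a negative `b` means it is increasing. In the noise-free two-patch case the expansion can be compared with the dominant eigenvalue of $\mathrm{diag}(\mu) + \delta Q^T$, which gives an independent sanity check of the spatial-heterogeneity term.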

To illustrate the utility of this latter approximation, we develop more explicit formulas for three scenarios: diffusive movement in a landscape where all patches are equally connected (that is, a classic "Levins" style landscape [Levins 1969]), diffusive movement in a landscape consisting of a ring of patches, and diffusive movement in a landscape with multiple spatial scales (that is, a hierarchical Levins landscape).

Example 5.1 Fully connected metapopulations with unbiased movement. Consider a population in which individuals disperse at the same per-capita rate $\delta/n$ between all pairs of patches. Let $\sigma^2$ be the variance of the within patch fluctuations and $\rho$ be the correlation in these fluctuations between any pair of patches. Under these assumptions, the dispersal matrix is $Q = J/n - I$ and the environmental covariance matrix is $\Sigma = (1-\rho)\sigma^2 I + \rho\sigma^2 J$, where recall that $J = \mathbf{1}\mathbf{1}^T$ is the matrix of all ones. Because $Q$ is symmetric, the stationary distribution of $Q$ is uniform; that is, $\pi_1 = \dots = \pi_n = \frac{1}{n}$. Hence, in the absence of population growth there would be equal numbers of individuals in each patch at large times.

Because the matrices $I$ and $J$ commute, the matrices $Q$ and $\Sigma$ also commute. Recall the notation of Corollary 5.3. The eigenvector $\xi_n$ is $n^{-1/2}\mathbf{1}$. If $\xi$ is any vector of length one orthogonal to $\mathbf{1}$, then $J\xi = 0$, and so $Q\xi = -\xi$ and $\Sigma\xi = (1-\rho)\sigma^2\xi$. We may thus take $\xi_1, \dots, \xi_{n-1}$ to be any orthonormal set of vectors orthogonal to $\mathbf{1}$. Moreover, $\lambda_1 = \dots = \lambda_{n-1} = -1$ and $\theta_1 = \dots = \theta_{n-1} = (1-\rho)\sigma^2$.

Now, $(\xi_n^T\mu)^2 = (1/n)\big(\sum_{k=1}^n \mu_k\big)^2 = n\bar\mu^2$ and so Parseval's identity implies that $\sum_{k=1}^{n-1}(\xi_k^T\mu)^2 = \sum_{k=1}^n \mu_k^2 - n\bar\mu^2 = \mu^T\mu - n\bar\mu^2$. Denote the variance of the vector $\mu$ by

$$\mathrm{Var}[\mu] = \frac{\mu^T\mu}{n} - \bar\mu^2 = \frac{1}{n}\sum_{k=1}^{n-1}(\xi_k^T\mu)^2.$$

Substituting these observations into equation (20), we get that

$$\chi(\delta) = \bar\mu - \frac{\sigma^2}{2n}\big(1 + (n-1)\rho\big) + \frac{1}{\delta}\left(\mathrm{Var}[\mu] - \frac{(n-1)(1-\rho)^2\sigma^4}{4n^2}\right) + O(\delta^{-2}). \tag{21}$$

Recall that for the special case of two uncorrelated patches with $D_{12} = D_{21} = \delta/2$, $\mu_1 = \mu_2 = \mu$, and $\sigma_1 = \sigma_2 = \sigma$ we showed from our exact formula for $\chi(\delta)$ in the two patch case that

$$\chi(\delta) = \mu - \frac{\sigma^2}{4} - \frac{1}{\delta}\,\frac{\sigma^4}{16} + O(\delta^{-2})$$

as $\delta \to \infty$, see (10). Hence, this approximation agrees with (21).

Approximation (21) implies that $\chi(\delta)$ is decreasing for large $\delta$ whenever

$$\frac{n\sqrt{\mathrm{Var}[\mu]}}{\sqrt{n-1}} > \frac{(1-\rho)\sigma^2}{2} \tag{22}$$

and that $\chi(\delta)$ is increasing if the opposite inequality holds. We have remarked that, in general, an intermediate dispersal rate is optimal when $\chi(0) < \chi(\infty)$ and $\chi(\delta) > \chi(\infty)$ for all sufficiently large $\delta$. This will occur for individuals in this diffusive dispersal regime when

$$\frac{(1-\rho)\sigma^2}{2} > \frac{\max_i \mu_i - \bar\mu}{1 - 1/n} \tag{23}$$

and (22) holds. In particular, when there are many patches (that is, $n \to \infty$), inequalities (23) and (22) are both satisfied if

$$(1-\rho)\sigma^2/2 > \max_i \mu_i - \bar\mu > 0.$$

Biological interpretation of equations (22) and (23). Highly diffusive movement has a negative impact on population growth whenever there are sufficiently many patches and there is sufficient spatial variation in the mean per-capita growth rates. Alternatively, if there is no spatial variation in the mean per-capita rates and stochastic fluctuations are not perfectly correlated, then the population growth rate continually increases with higher dispersal rates. This latter observation is consistent with the fact that distributing individuals equally across the landscape is the optimal patch distribution in this case. In contrast, if there is some spatial variation in the mean per-capita growth rates and there are sufficiently large, but not perfectly correlated, environmental fluctuations, then an intermediate dispersal rate maximizes the stochastic growth rate for diffusively dispersing populations.



In order to apply Corollary 5.3, we need to simultaneously diagonalize the matrices $Q$ and $\Sigma$. A situation in which this is possible and the resulting formulas provide insight into biologically relevant scenarios is when the dispersal mechanism and the covariance structure of the noise both exhibit the symmetries of an underlying group. Example 5.1 above is a particular instance of this situation.

More specifically, we suppose that the patches can be labeled with the elements of a finite group $G$ in such a way that the migration rate $Q_{g,h}$ and environmental covariance $\Sigma_{g,h}$ between patches $g$ and $h$ both only depend on the "displacement" $gh^{-1}$ from $g$ to $h$ in $G$. That is, we assume there exist functions $q$ and $s$ on $G$ such that $Q_{g,h} = q(gh^{-1})$ and $\Sigma_{g,h} = s(gh^{-1})$. For instance, if $G$ is the group of integers modulo $n$, then the habitat has $n$ patches arranged in a circle, and the dispersal rate and environmental covariance between two patches only depend on the distance between them, measured in steps around the circle. We do not require that the vector $\mu$ of mean per-capita growth rates satisfies any symmetry conditions.

The matrices $Q$ and $\Sigma$ will commute if $q$ and $s$ are class functions, that is, if $q(gh) = q(hg)$ and $s(gh) = s(hg)$ for all $g, h \in G$. We assume this condition holds from now on. Note that if $G$ is Abelian (that is, the group operation is commutative), then any function is a class function.

5.1. Background on group representations. We now record a few facts about representation theory, the tool that will enable us to find the eigenvalues and eigenvectors of $Q$ and $\Sigma$, resulting in Theorem 5.4. We refer readers interested in more detail to Serre [1977] and Diaconis [1988], while readers interested in less mathematical detail may skip directly to Examples 5.2 and 5.3 without loss of continuity.

A unitary representation of a group $G$ is a homomorphism $\rho$ from $G$ into the group of $d_\rho \times d_\rho$ unitary matrices, where $d_\rho$ is called the degree of the representation. Two representations $\rho'$ and $\rho''$ are equivalent if there exists a unitary matrix $U$ such that $\rho''(g) = U\rho'(g)U^{-1}$ for all $g \in G$. A representation $\rho'$ is irreducible if it is not equivalent to some representation $\rho''$ for which $\rho''(g)$ is of the same block diagonal form for all $g \in G$. A finite group has a finite set of inequivalent, irreducible, unitary representations, which we denote by $\widehat{G}$. The simplest representation is the trivial representation $\rho_{\mathrm{tr}}$ of degree one, for which $\rho_{\mathrm{tr}}(g) = 1$ for all $g$.

For a simple example that we will return to, let $G = \mathbb{Z}_n$, the group of integers modulo $n$. Since $\mathbb{Z}_n$ is Abelian, all the irreducible representations are one-dimensional ($d_\rho = 1$ for all $\rho \in \widehat{G}$), and are of the form $\rho^{(m)}(j) = \exp(2\pi i m j/n)$, so that $\widehat{G} = \{\rho^{(m)} : 0 \le m \le n-1\}$.

The matrix entries of irreducible representations are orthogonal: for $\rho', \rho'' \in \widehat{G}$,

$$\sum_{g \in G} \rho'_{ij}(g)\, \rho''_{kl}(g)^* = \begin{cases} \#G / d_{\rho'}, & \text{if } \rho' = \rho'' \text{ and } (i,j) = (k,l), \\ 0, & \text{otherwise,} \end{cases} \tag{24}$$

where $z^*$ denotes the complex conjugate of a complex number $z$, and $\#G$ is the number of elements of $G$.

The Fourier transform of a function $f : G \to \mathbb{C}$ is a function $\hat f$ on $\widehat{G}$ defined by

$$\hat f(\rho) := \sum_{g \in G} f(g)\rho(g) \quad \text{for } \rho \in \widehat{G}. \tag{25}$$

Note that $\hat f(\rho)$ is a $d_\rho \times d_\rho$ matrix. It follows from the orthogonality properties of the matrix entries of the irreducible representations recorded above that the Fourier transform may be inverted, giving $f$ explicitly as the linear combination of matrix entries of $\hat f$. The inversion formula is

$$f(g) = \frac{1}{\#G}\sum_{\rho \in \widehat{G}} d_\rho\, \mathrm{Tr}\big(\rho(g^{-1})\hat f(\rho)\big).$$

For $G = \mathbb{Z}_n$, this is the familiar discrete Fourier transform, for which orthogonality of matrix entries is the fact that $(1/n)\sum_{j=0}^{n-1}\exp(2\pi i j(\ell - m)/n) = \delta_{\ell m}$. The transform is given by $\hat f(\rho^{(m)}) = \sum_{k=0}^{n-1} f(k)\exp(2\pi i m k/n)$ for $0 \le m \le n-1$, and $f(k) = (1/n)\sum_{m=0}^{n-1}\hat f(\rho^{(m)})\exp(-2\pi i m k/n)$. The trivial character is $\kappa_{\mathrm{tr}} = \rho^{(0)}$.
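The sign convention for the transform on $\mathbb{Z}_n$ matters when using a numerical FFT library. The sketch below assumes the convention $\hat f(\rho^{(m)}) = \sum_k f(k)\exp(+2\pi i m k/n)$ and maps it onto NumPy, whose forward FFT uses the opposite sign in the exponent. The function names are hypothetical.

```python
import numpy as np

def fourier_transform_Zn(f):
    """f_hat(m) = sum_k f(k) exp(+2 pi i m k / n).

    numpy.fft.ifft computes (1/n) sum_k f(k) exp(+2 pi i m k / n), so the
    transform with this sign convention equals n * ifft(f).
    """
    f = np.asarray(f, dtype=complex)
    return f.size * np.fft.ifft(f)

def inverse_fourier_transform_Zn(f_hat):
    """f(k) = (1/n) sum_m f_hat(m) exp(-2 pi i m k / n), i.e. fft(f_hat) / n."""
    f_hat = np.asarray(f_hat, dtype=complex)
    return np.fft.fft(f_hat) / f_hat.size

if __name__ == "__main__":
    f = np.array([1.0, 2.0, 0.5, -1.0, 3.0])
    f_hat = fourier_transform_Zn(f)
    print(np.round(f_hat, 4))
    print(np.round(inverse_fourier_transform_Zn(f_hat).real, 4))   # recovers f
```

For real, symmetric functions such as the dispersal kernel $q$ with $q(g) = q(g^{-1})$ the two sign conventions give the same real-valued transform, so the distinction only matters for general $f$.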

Associated with a representation $\rho \in \widehat{G}$ (the set of irreducible representations) is its character $\kappa$, defined by $\kappa(g) := \mathrm{Tr}\,\rho(g)$. We write $\widetilde{G}$ for the set of characters of irreducible representations. The characters are class functions, and form an orthogonal basis for the subspace of class functions on $G$ and all have the same norm: $\sum_{g \in G}|\kappa(g)|^2 = \#G$, where $|z| = \sqrt{zz^*}$ is the modulus of the complex number $z$. For $\rho \in \widehat{G}$ with character $\kappa \in \widetilde{G}$, the Fourier transform of a class function $f$ satisfies

$$\hat f(\rho) = \frac{1}{d_\rho}\hat f(\kappa)\, I,$$

where $I$ is the $d_\rho \times d_\rho$ identity matrix and

$$\hat f(\kappa) := \sum_{g \in G} f(g)\kappa(g). \tag{26}$$

Consequently,

$$f(g) = \frac{1}{\#G}\sum_{\kappa \in \widetilde{G}} \hat f(\kappa)\,\kappa(g)^*. \tag{27}$$

As noted above, if $G = \mathbb{Z}_n$ then all irreducible representations are one-dimensional, so in this case we may identify the characters with the irreducible representations, $\widetilde{G} = \widehat{G}$. Since $\mathbb{Z}_n$ is Abelian, all functions on $\mathbb{Z}_n$ are class functions, so that the two Fourier transforms (25) and (26) are equal.
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457 Finally, given a function / on G and character k, define 



The following theorem is proved in Appendix F.

Theorem 5.4. Suppose that the $n$ patches are labeled by a finite group $G$ in such a way that $Q_{g,h} = q(gh^{-1})$ and $\Sigma_{g,h} = s(gh^{-1})$, where $q$ and $s$ are class functions. Suppose further that $q(g) = q(g^{-1})$, $g \in G$, so that the matrix $Q$ is symmetric. Let $\bar\mu = \frac{1}{n}\sum_{g \in G}\mu(g)$ and $\bar s = \frac{1}{n}\sum_{g \in G} s(g)$. Then,

KeG\{«^j.} 




as $\delta \to \infty$. Furthermore, $\hat q(\kappa) < 0$ for every character $\kappa$ of an irreducible representation other than the trivial character $\kappa_{\mathrm{tr}}$.

Roughly speaking, this expression tells us about the respective roles of variance of patch quality ($\mu$) and covariance of environmental noise ($s$). The fact that $\hat q(\kappa)$ is negative for all $\kappa$ leads to the following.



Biological interpretation of equation (28). If variability in patch quality at a certain scale is larger than the correlation in environmental noise at that scale, in a sense made precise above, then the stochastic growth rate decreases with increasing dispersal rates at that scale. Conversely, if environmental noise is strongly correlated between patches and the mean patch quality is similar, then more dispersal is expected to be better. The relevant sense of "at that scale" is in the sense of the Fourier transform, analogous to the "frequency domain" in Fourier analysis.

Example 5.2 Circle of Patches. Suppose that the $n$ patches of a habitat are arranged in a circle and are labeled by $\mathbb{Z}_n = \{0, 1, \dots, n-1\}$, the group of integers modulo $n$ with identity element 0. As reviewed above, the Fourier transform is the familiar discrete Fourier transform.

If we assume that individuals disperse only to neighboring patches and these dispersal rates are equal, then $q(1) = q(n-1) = 1/2$, $q(0) = -1$ and $q(2) = \dots = q(n-2) = 0$. Assume the environmental noise is independent between patches and has variance $\sigma^2$, i.e. $s(0) = \sigma^2$ and $0 = s(1) = \dots = s(n-1)$. Finally, suppose that patch quality as measured by the average per-capita growth rates is spatially periodic, so that $\mu(k) = \bar\mu + c\cos(2\pi k\ell/n)$ for some $c > 0$, $\bar\mu$, and $1 \le \ell < n/2$.

Under this set of assumptions, we can compute that for $m \ne 0$, $\hat q(m) = \cos(2\pi m/n) - 1$ and $\hat s(m) = \sigma^2$. Furthermore, $\|\mu\|^2_{\kappa_\ell} = \|\mu\|^2_{\kappa_{n-\ell}} = nc^2/4$ and $\|\mu\|^2_{\kappa_m} = 0$ otherwise. From these computations, Theorem 5.4 implies that

$$\chi(\delta) \approx \bar\mu - \frac{\sigma^2}{2n} - \frac{1}{\delta}\left(\frac{c^2}{2(\cos(2\pi\ell/n) - 1)} - \sum_{m=1}^{n-1}\frac{\sigma^4}{4n^2(\cos(2\pi m/n) - 1)}\right)$$

for large $\delta$. Using the identity $\sum_{k=1}^{n-1}(1 - \cos(2k\pi/n))^{-1} = (n^2 - 1)/6$ (see equation 1.381.1 in Gradshteyn and Ryzhik [2007]'s table of integrals and series), this approximation simplifies to

$$\chi(\delta) \approx \bar\mu - \frac{\sigma^2}{2n} + \frac{1}{4\delta n^2}\left(\frac{2n^2c^2}{1 - \cos(2\pi\ell/n)} - \frac{1}{6}(n^2 - 1)\sigma^4\right). \tag{29}$$

Since $\chi(0) = \bar\mu + c - \sigma^2/2$, high dispersal is better than no dispersal if $\chi(\infty) - \chi(0) = \sigma^2(1 - 1/n)/2 - c > 0$. When the number of patches is sufficiently large, this inequality implies that highly dispersive populations grow faster than sedentary populations provided that the temporal variation is sufficiently greater than the spatial variation in per-capita growth rates, i.e. $\sigma^2 > 2c$. On the other hand, $\chi(\delta)$ is decreasing for large $\delta$ if the coefficient of $1/\delta$ is positive, i.e.

$$4c^2 > \frac{1}{3}\big(1 - \cos(2\pi\ell/n)\big)\big(1 - n^{-2}\big)\sigma^4.$$

Hence, if $\ell/n$ is small enough, then $\chi(\delta)$ is decreasing for large $\delta$.

Biological interpretation of equation (29). In a circular habitat with nearest-neighbor dispersal and sinusoidally varying patch quality, intermediate dispersal rates maximize the stochastic growth rate provided that spatial heterogeneity occurs on a short scale (i.e. $\ell/n$ sufficiently small) and temporal variability is sufficiently large.
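The ingredients of the circle example are easy to verify numerically. The sketch below is illustrative: it builds the nearest-neighbour dispersal matrix from $q(0) = -1$, $q(1) = q(n-1) = 1/2$, checks that its Fourier transform is $\cos(2\pi m/n) - 1$, and evaluates the coefficient of $1/\delta$ under the assumed reading $b = c^2/\big(2(1 - \cos(2\pi\ell/n))\big) - (n^2-1)\sigma^4/(24n^2)$ of the simplified approximation. All function names are hypothetical.

```python
import numpy as np

def circle_dispersal_matrix(n):
    """Return (q, Q) for a ring of n >= 3 patches, with Q[g, h] = q((g - h) mod n)."""
    q = np.zeros(n)
    q[0], q[1], q[n - 1] = -1.0, 0.5, 0.5
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return q, q[idx]

def q_hat(q):
    """Fourier transform q_hat(m) = sum_k q(k) exp(2 pi i m k / n)."""
    n = q.size
    k = np.arange(n)
    return np.array([np.sum(q * np.exp(2j * np.pi * m * k / n)) for m in range(n)])

def coefficient_29(n, c, ell, sigma2):
    """Coefficient b of 1/delta in chi(delta) ~ chi(inf) + b / delta (assumed form)."""
    spatial = c**2 / (2.0 * (1.0 - np.cos(2.0 * np.pi * ell / n)))
    temporal = sigma2**2 * (n**2 - 1) / (24.0 * n**2)
    return spatial - temporal

if __name__ == "__main__":
    n = 8
    q, Q = circle_dispersal_matrix(n)
    print(np.round(q_hat(q).real, 4))
    for ell in (1, 2, 3):
        print(ell, coefficient_29(n, c=0.3, ell=ell, sigma2=1.2))
```

Smaller values of $\ell/n$ shrink $1 - \cos(2\pi\ell/n)$ and therefore inflate the spatial term, which is why the coefficient becomes positive (and $\chi(\delta)$ decreasing for large $\delta$) when $\ell/n$ is small enough.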



Example 5.3 Multi-scale patches. Suppose now that our organism lives in a hierarchically structured habitat. For example, individuals might live on bushes, the bushes grow around the edges of clearings, and the clearings are scattered across an archipelago of islands. We label each bush with an ordered triple recording on which island, in which clearing, and in what bush around the clearing it lives, so that for instance (2, 1, 4) denotes the fourth bush in the first clearing of the second island. To make the mathematical picture a pretty one, we suppose that each of the $I$ islands has the same number $C$ of clearings and each clearing has the same number $B$ of bushes. This enables us to identify the habitat structure with the group $\mathbb{Z}_I \otimes \mathbb{Z}_C \otimes \mathbb{Z}_B$, where, as above, $\mathbb{Z}_m$ is the group of integers modulo $m$. We will get particularly simple and interpretable results if we also assume that dispersal rates and environmental covariances only depend on the scale at which the movement occurs: between bushes, clearings, or islands.

Although it requires imaginative work to find examples with many more scales than this (do the organism's fleas have fleas?) it does not cost us anything to work in greater generality. Suppose, then, that the patches in the habitat are labeled with the group $G = G_1 \otimes \dots \otimes G_k$, where $G_j = \mathbb{Z}_{n_j}$ for $1 \le j \le k$.

Thus, one patch is labeled with the identity element $\mathrm{id}_G = (\mathrm{id}_1, \dots, \mathrm{id}_k)$ and every other patch is labeled by the displacement required to get there from $\mathrm{id}_G$. The later coordinates are understood to be at finer "scales", so that if $g_i = h_i$ for all $1 \le i \le j-1$, then $g$ and $h$ represent patches in the same metapatch at scale $j$. For instance, in our example above, the archipelago of islands is the single metapatch at scale 1 and the metapatches at scales 2 and 3 are, respectively, the islands and the clearings. We label the metapatches at scale $r$ with the set $Z_r := \{g \in G : g_r = \mathrm{id}_r, \dots, g_k = \mathrm{id}_k\}$, with the convention that $Z_{k+1} := G$. Because a label $g = (g_1, \dots, g_k) \in G$ represents displacement, the coordinate of the leftmost non-identity element of $g$, denoted by

$$\ell(g) := \min\{j : g_j \ne \mathrm{id}_j\} \quad \text{and} \quad \ell(\mathrm{id}_G) = k + 1,$$

tells us the scale on which the motion occurs: $g \in G$ corresponds to a displacement that moves between patches within the same metapatch at scale $\ell(g)$ but moves from a patch within a metapatch at scale $\ell(g) + 1$ to a patch within some other metapatch at that scale. Note that $1 \le \ell(g) \le k + 1$.
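The scale function $\ell(g)$ is a simple scan for the first non-identity coordinate. The sketch below is a hypothetical illustration: patches are tuples with 0-based coordinates (so 0 plays the role of each identity $\mathrm{id}_j$, unlike the 1-based labels such as (2, 1, 4) used in the prose), and the group sizes are made up.

```python
def displacement(g, h, sizes):
    """Coordinatewise difference g - h in Z_{n_1} x ... x Z_{n_k}."""
    return tuple((a - b) % n for a, b, n in zip(g, h, sizes))

def displacement_scale(g):
    """Return l(g): the 1-based index of the leftmost non-identity coordinate.

    The identity displacement (all zeros) is assigned k + 1, where k = len(g).
    """
    for j, coordinate in enumerate(g, start=1):
        if coordinate != 0:
            return j
    return len(g) + 1

if __name__ == "__main__":
    sizes = (3, 2, 4)                     # islands, clearings per island, bushes per clearing
    same_clearing = displacement((1, 0, 3), (1, 0, 1), sizes)
    other_island = displacement((2, 1, 0), (0, 1, 0), sizes)
    print(displacement_scale(same_clearing))   # motion between bushes only
    print(displacement_scale(other_island))    # motion between islands
```

With dispersal rates and covariances of the form $q(g) = q_{\ell(g)}$ and $s(g) = s_{\ell(g)}$, this function is all that is needed to fill in the matrices $Q$ and $\Sigma$ entry by entry.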

We assume that the dispersal rate and the environmental covariance between two patches only depend on the scale of the displacement necessary to move between the two patches. That is, we suppose there are numbers $q_1, \dots, q_{k+1}$ and $s_1, \dots, s_{k+1}$ such that $q(g) = q_{\ell(g)}$ and $s(g) = s_{\ell(g)}$.

In Appendix G we show that the Fourier transforms appearing in Theorem 5.4 depend on the following quantities.

519 Let Nr :— j^Z^ = nj'=i be the number of metapatches at scale r. Write := {17 S G : g^ — idj, j < r} for the 

520 subgroup of displacements that move from one patch to another within the same metapatch at scale r + 1 and set 

521 Nr #Zr = rij^r+l ■ 

geZr y heG,. \ zez.r } \ heGr zez,. ) j 

We can interpret this quantity as follows. There are $N_r$ metapatches at scale $r$. Each one has within it $n_r$ metapatches at scale $r + 1$. First, compute the average of $\mu$ over all the patches within each metapatch at scale $r + 1$, then compute the variance of these averages within each metapatch at scale $r$, and finally average these variances across all the metapatches at scale $r$ to produce $v_\mu(r)$. Thus, $v_\mu(r)$ measures the variability in $\mu$ that can be attributed to scale $r + 1$. Set

fc 

l=r 

527 and 

r 



The following result agrees with equation (21), which describes the special case where there is a single scale.

Theorem 5.5. For a habitat with the above multi-scale structure, equation (19) reduces to

Nr- 



(30) 



X{6) 



1 



1 



1 



27 

as $\delta \to \infty$. Furthermore, $q(r) < 0$ for all $1 \le r \le k$.



Nr 



4^1+1 



Note that if $s_\ell$ increases with $\ell$ (that is, two patches within the same metapatch have a higher environmental covariance than two patches in different metapatches at that scale), then $s(r)$ decreases with $r$. Also, if $q_\ell$ increases with $\ell$ (that is, there is a higher rate for dispersing to a patch within the same metapatch at some scale than to a patch in another metapatch at that scale), then $q(r)$ is negative and decreases with $r$. Using these observations, we may read off several things from (30).



First, consider a simple example with a fixed, large number $n$ of patches distributed among a variable number of islands. Now $k = 2$, and let the number of islands be $n_1 = 1/a$, with $a \le 1$, so that the number of patches on each island is $n_2 = an$. In this case, $N_1 = 1/a$ and $N_2 = n$, while $\bar N_0 = n$, $\bar N_1 = an$, and $\bar N_2 = 1$, so (30) reads

(31) X{S)

S2) + an{s2 - si))^ w^(2) - {an - l)(s3 - 32)^ '



v,il) , (l-a)((s3 



qin 
ail — a){s2 



aqin^ 



an'^{q2a + (Ji(l — a)) 



Sqi 



+ 0{n-^). 



The effect of higher dispersal depends on the difference in covariances between patches on the same island and on 
different islands, and on the number of islands. 



Biological interpretation of equation (31). If a sufficiently large number of patches are distributed equally across a number of islands, then for a given dispersal pattern, the stochastic growth rate increases with the dispersal rate (at high levels of dispersal). This effect is strongest if there are only two islands (i.e. $a = 1/2$).

Secondly, imagine a fixed ensemble of patches with varying mean per-capita growth rates and consider the following two possibilities for assignment of these patches to metapatches at scale 2 (the islands in our bush-clearing-island example). One possibility is that some islands are assigned patches that are primarily of high quality, whereas other islands are mostly assigned poor patches. The other possibility is that patches of different quality are evenly spread across the islands, with the range of quality within an island similar to the range of quality between islands. In the first case, the variance across islands of within-island means is comparable to the variance across all patches, so $v_\mu(1) \approx v_\mu(k)$. In the second case, the within-island means are approximately constant, so that $v_\mu(1)$ will be small. Therefore, since $q(r)$ is negative for all $r$, having local positive association of $\mu$ at nearby patches leads to higher stochastic growth rates, at least for large enough values of the dispersal parameter $\delta$.



Biological interpretation of equation (30). All other things being equal, the species will do better if the good habitat is concentrated on particular islands, rather than spread out across many.
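The contrast between the two assignments of patches to islands can be illustrated numerically. The following is a minimal sketch, not the paper's computation: the growth rates and the helper names (`variance_of_island_means`, `clustered`, `spread`) are hypothetical, and the quantity computed is only the "variance across islands of within-island means" described in the text for a habitat with a single level of islands.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def variance(values):
    """Population variance (divide by the number of values)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def variance_of_island_means(mu_by_island):
    """Variance, across islands, of the within-island mean growth rate."""
    return variance([mean(island) for island in mu_by_island])


# Hypothetical per-capita growth rates: the same eight patches in both cases.
clustered = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]  # good habitat on one island
spread = [[1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0]]      # good habitat evenly spread

all_patches = [m for island in clustered for m in island]
total_variance = variance(all_patches)
clustered_variance = variance_of_island_means(clustered)
spread_variance = variance_of_island_means(spread)
```

In this toy example the clustered assignment puts all of the variance among patches at the island scale, whereas the evenly spread assignment puts none of it there.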

Finally, we can observe that adding new scales of metapatch may change the situation from one in which $\chi(\delta)$ is maximal at high values of the dispersal parameter $\delta$ to one in which $\chi(\delta)$ is maximal at intermediate values of $\delta$, or vice-versa. If $n_1 = 1$, then $s(1)$ and $v_\mu(1)$ are both zero, and changing $n_1$ (for example, going from one to several islands in our example) will increase $s(1)$. Changing $n_1$ will also add the quantity $-q_1(n_1 - 1)N_1$ to all values of $q(r)$. The result of this could be to change the sign of the coefficient of $1/\delta$ in (19).



Biological interpretation of equation (30). The optimal level of dispersal for a subpopulation, and the growth rate at that level of dispersal, may differ drastically depending on whether it is connected (or connectable) by dispersal to other subpopulations.

6. Discussion 

Classical ecological theory predicts that environmental stochasticity increases extinction risk by reducing the long-term per-capita growth rate of populations (May 1975; Turelli 1978). For sedentary populations in a spatially homogeneous yet temporally variable environment, a simple model of their growth is given by the stochastic differential equation $dZ_t = \mu Z_t\,dt + \sigma Z_t\,dB_t$, where $B$ is a standard Brownian motion. The stochastic growth rate for such populations equals $\mu - \sigma^2/2$; the reduction in the growth rate is proportional to the infinitesimal variance of the noise. Here, we show that a similar expression describes the growth of populations dispersing in spatially and temporally heterogeneous environments. More specifically, if the average per-capita growth rate in patch $i$ is $\mu_i$ and the infinitesimal spatial covariance between environmental noise in patches $i$ and $j$ is $\sigma_{ij}$, then the stochastic growth rate equals the average of the mean per-capita growth rate $\langle\mu\rangle = \sum_i \mu_i \mathbb{E}[Y^i_\infty]$ experienced by the population when the proportions of the population in the various patches have reached equilibrium minus half of the average temporal variation $\langle\sigma^2\rangle = \mathbb{E}[\sum_{i,j} \sigma_{ij} Y^i_\infty Y^j_\infty]$ experienced by the population in equilibrium. The law of $Y_\infty$, the random equilibrium spatial distribution of the population which provides the weights in these averages, is determined by interactions between spatial heterogeneity in mean per-capita growth rates, the infinitesimal spatial covariances of the environmental noise, and population movement patterns. To investigate how these interactions affect the stochastic growth rate, we derived analytic expressions for the law of $Y_\infty$, determined what choice of dispersal mechanisms resulted in optimal stochastic growth rates for a freely dispersing population, and considered the consequences for the stochastic growth rate of limiting the population to a fixed dispersal mechanism. As we now discuss, these analytic results provide fundamental insights into "ideal free" movement in the face of uncertainty, the persistence of coupled sink populations, the evolution of dispersal rates, and the single large or several small (SLOSS) debate in conservation biology.
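The single-patch formula $\mu - \sigma^2/2$ can be checked by simulation. The sketch below is illustrative only: the function names, the Euler-Maruyama discretization, the step size and the parameter values are assumptions introduced here and do not come from the paper.

```python
import math
import random


def stochastic_growth_rate(mu, sigma):
    """Closed-form long-term growth rate of dZ = mu*Z dt + sigma*Z dB."""
    return mu - 0.5 * sigma ** 2


def simulated_growth_rate(mu, sigma, horizon, steps, seed=0):
    """Estimate log(Z_T / Z_0) / T with an Euler-Maruyama scheme.

    Each step multiplies Z by (1 + mu*dt + sigma*dW); the logarithm is
    accumulated instead of Z itself to avoid overflow or underflow.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    log_z = 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        log_z += math.log(1.0 + mu * dt + sigma * dw)
    return log_z / horizon


exact = stochastic_growth_rate(0.1, 0.5)
estimate = simulated_growth_rate(0.1, 0.5, horizon=5000.0, steps=250000, seed=1)
```

With these hypothetical parameters the mean growth rate $\mu$ is positive while the stochastic growth rate is negative, so the simulated population declines in the long run despite growing on average.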

In spatially heterogeneous environments, "ideal free" individuals disperse to the patch or patches that maximize their long-term per-capita growth rate (Fretwell and Lucas 1970; Harper 1982; Oksanen et al. 1995; van Baalen and Sabelis 1999; Schreiber et al. 2000; Schreiber and Vejdani 2006; Kirkland et al. 2006; Cantrell et al. 2007). In the absence of environmental stochasticity and density-dependent feedbacks, ideal free dispersers only select the patches supporting the highest per-capita growth rate. Here, we show that uncertainty due to environmental stochasticity can overturn this prediction. Provided environmental stochasticity is sufficiently strong and spatial correlations are sufficiently weak, equation (16) implies that ideal free dispersers occupy all patches despite spatial variation in the local stochastic growth rates $\mu_i - \sigma_i^2/2$. Intuitively, by spending time in multiple patches, including those that in isolation support lower stochastic growth rates, individuals reduce the net environmental variation $\langle\sigma^2\rangle$ they experience and, thereby, increase their stochastic growth rate. Hence, dispersing to lower quality patches is a form of bet-hedging against environmental uncertainty (Slatkin 1974; Philippi and Seger 1989; Wilbur and Rudolf 2006). When environmental fluctuations in higher quality patches are sufficiently strong, this spatial bet-hedging can result in ideal free dispersers occupying sink patches, that is, patches that are unable in the absence of immigration to sustain a population. This latter prediction is consistent with Holt's analysis of a discrete-time two-patch model (Holt 1997). Spatial correlations in environmental fluctuations, however, can disrupt spatial bet-hedging. Movement between patches exhibiting strongly covarying environmental fluctuations has little effect on the net environmental variation $\langle\sigma^2\rangle$ experienced by individuals and, therefore, movement to lower quality patches may confer little or no advantage to individuals. Indeed, when the spatial covariation is sufficiently strong, ideal free dispersers only occupy patches with the highest local stochastic growth rates $\mu_i - \sigma_i^2/2$, similar to the case of deterministic environments. In deterministic environments, density-dependent feedbacks can result in ideal free dispersers occupying multiple patches including sink patches (Fretwell and Lucas 1970; Cantrell et al. 2007; Holt and McPeek 1996). Our results show that even density-independent processes can result in populations occupying multiple patches. However, both of these processes are likely to play important roles in the evolution of patch selection.
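The bet-hedging argument can be made concrete for two patches. The sketch below rests on an assumption stated here rather than taken from the paper's equations: a population that keeps a fixed fraction $p$ of its individuals in patch 1 is taken to grow at rate $\langle\mu\rangle - \langle\sigma^2\rangle/2$ with the deterministic weights $(p, 1-p)$. The function names, the grid search and the parameter values are hypothetical.

```python
def growth_rate(p, mu1, mu2, sigma1, sigma2, rho):
    """Growth rate for a fixed fraction p in patch 1 (assumed constant in time)."""
    q = 1.0 - p
    net_variance = (sigma1 * p) ** 2 + 2.0 * rho * sigma1 * sigma2 * p * q + (sigma2 * q) ** 2
    return mu1 * p + mu2 * q - 0.5 * net_variance


def best_allocation(mu1, mu2, sigma1, sigma2, rho, grid=1000):
    """Grid search for the fraction p in [0, 1] maximizing the growth rate."""
    candidates = [i / grid for i in range(grid + 1)]
    return max(candidates, key=lambda p: growth_rate(p, mu1, mu2, sigma1, sigma2, rho))


# Patch 1 is better in isolation (1.0 - 0.5 > 0.9 - 0.5), yet with uncorrelated
# noise the best allocation is interior; with perfectly correlated noise it is not.
p_uncorrelated = best_allocation(1.0, 0.9, 1.0, 1.0, rho=0.0)
p_correlated = best_allocation(1.0, 0.9, 1.0, 1.0, rho=1.0)
```

Under these assumptions, uncorrelated noise favours occupying both patches, while perfectly correlated noise removes the benefit of spreading and the higher-quality patch alone is selected.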

A sink population is a local population that is sustained by immigration (Holt 1985; Pulliam 1988; Dias). Removing immigration results in a steady decline to extinction. In contrast, source populations persist in the absence of immigration. Empirical studies have shown that landscapes often partition into mosaics of source and sink populations (Murphy 2001; Kreuzer and Huntly 2003; Keagy et al. 2005). For discrete-time two-patch models, Jansen and Yoshimura (1998) showed, quite surprisingly, that sink populations coupled by dispersal can persist, a prediction supported by recent empirical studies with protozoan populations (Matthews and Gonzalez 2007) and extended to discrete-time multi-patch models (Roy et al. 2005; Schreiber 2010). Here, we show a similar phenomenon occurs for populations experiencing continuous temporal fluctuations. For example, if the stochastic growth rates in all patches equal $\mu - \sigma^2/2$ and the spatial correlation between patches is $\rho$, then equations (5) and (18) imply that populations dispersing freely between $n$ patches persist whenever $\mu - ((n - 1)\rho + 1)\sigma^2/(2n) > 0$. Hence, ideal free movement mediates persistence whenever local environmental fluctuations produce sink populations (i.e., $\sigma^2/2 > \mu > 0$), environmental fluctuations aren't fully spatially correlated (i.e. $\rho < 2\mu/\sigma^2$) and there are sufficiently many patches (i.e., $n > ((1 - \rho)\sigma^2)/(2\mu - \rho\sigma^2)$). This latter expression for the necessary number of patches to mediate persistence is an exact, continuous time counterpart to an approximation by Bascompte et al. (2002) for discrete time models. When two patches are sufficient to mediate persistence, equation (9) reveals that there is a critical dispersal rate below which the population is extinction prone and above which it persists. Our high dispersal approximation (see equation (21) with $\mathrm{Var}[\mu] = 0$) suggests this dispersal threshold also exists for an arbitrary number of patches.
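The persistence condition for $n$ identical patches is simple enough to evaluate directly. The following sketch implements the inequality $\mu - ((n-1)\rho + 1)\sigma^2/(2n) > 0$ quoted above; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def free_dispersal_growth_rate(mu, sigma2, rho, n):
    """Growth rate mu - ((n-1)*rho + 1)*sigma^2/(2n) for n identical patches."""
    return mu - ((n - 1) * rho + 1.0) * sigma2 / (2.0 * n)


def min_patches_for_persistence(mu, sigma2, rho, n_max=10000):
    """Smallest n <= n_max with a positive growth rate, or None if there is none."""
    for n in range(1, n_max + 1):
        if free_dispersal_growth_rate(mu, sigma2, rho, n) > 0.0:
            return n
    return None


# Each patch alone is a sink: mu - sigma^2/2 = 0.3 - 0.5 < 0.
weakly_correlated = min_patches_for_persistence(mu=0.3, sigma2=1.0, rho=0.1)
# Here rho >= 2*mu/sigma^2 = 0.6, so no number of patches rescues the population.
strongly_correlated = min_patches_for_persistence(mu=0.3, sigma2=1.0, rho=0.7)
```

In the first case the closed-form bound $(1 - \rho)\sigma^2/(2\mu - \rho\sigma^2)$ evaluates to $1.8$, so two patches already suffice.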

While ideal free movement corresponds to the optimal dispersal strategy for species without any constraints on their movement or their ability to collect information, many organisms experience these constraints. For instance, in the absence of information about environmental conditions in other patches, individuals may move randomly between patches, in which case the rate of movement (rather than the pattern of movement) is subject to natural selection (Hastings 1983; Levin et al. 1984; Hutson et al. 2001; Kirkland et al. 2006). When density-dependent feedbacks are weak and certain symmetry assumptions are met, our high dispersal approximation in (20) implies there is selection for higher dispersal rates whenever


E 



1 



1 ^ 

4n 



ri-l 



(32) 

where 
and 

asserts that if temporal variation (averaged in the appropriate manner) exceeds spatial variation, then there is 



k = l ' ' k=l 

recall, X^. < 0, a-re the eigenvalues/ vectors of the dispersal matrix, /i is the vector of per-capita growth rates, 
fc are the eigenvalues of the covariance matrix for the environmental noise. Roughly speaking, equation ( 32 ) 



selection for faster dispersers; a prediction consistent with the general consensus of earlier studies Levin et al. 1984 



McPeek and Holt 1992 Hutson et al. 2001 . More specifically, in the highly symmetric case where the temporal 



variation in all patches equals and the spatial correlation between patches is p, equation ( 32 ) simplifies to 

(l-p)a2 



(33) 



> 



VVarM, 



2 ' y^T^ 

in which case lower spatial correlations and larger numbers of patches also facilitate selection for faster dispersers. Another important constraint influencing the evolution of dispersal is travel costs that reduce the fitness of dispersing individuals. While the effect of these costs has been investigated for deterministic models (DeAngelis et al. 2011), it remains to be seen how these travel costs interact with environmental stochasticity in determining optimal dispersal strategies.

Previous studies have shown that spatial heterogeneity in per-capita growth rates increases the net population growth rate for deterministic models with diffusive movement (Adler 1992; Schreiber and Lloyd-Smith 2009). Intuitively, spatial heterogeneity provides patches with higher per-capita growth rates that boost the population growth rate, a boost that gets diluted at higher dispersal rates. Our high dispersal approximation (20) shows that this boost also occurs in temporally heterogeneous environments, i.e. the correction term $-\sum_{k=1}^{n-1} \frac{1}{\lambda_k}(\xi_k^T\mu)^2/\delta$ is positive. More importantly, the multiscale version of this correction term (30) implies this boost is larger when the variation in the per-capita growth rates occurs at multiple spatial scales. For example, for insects living on plants in meadows on islands, the largest boost occurs when the higher quality plants (i.e. the plants supporting the largest $\mu_i$ values) occur on the same island in the same meadow. This analytic conclusion is consistent with numerical simulations showing that habitat fragmentation (e.g. distributing high quality plants more evenly across islands and meadows) increases extinction risk (Fahrig 1997, 2002). Intuitively, spatial aggregation of higher quality patches increases the chance of individuals dispersing away from a high quality patch arriving in another high quality patch. Even without spatial variation in per-capita growth rates, equation (30) implies that strong spatial aggregation of patches maximizes stochastic growth rates for dispersive populations living in environments where temporal correlations decrease with spatial scale. This finding promotes the view that a single large (SL) reserve is better for conservation than several small reserves (… and Murphy 1985; Gilpin 1988; Cantrell and Cosner 1989, 1991). For example, using reaction-diffusion equations, Cantrell and Cosner (1991) found that even in deterministic environments "[it] is better for a population to have a few large regions of favorable habitat than a great many small ones closely intermingled with unfavorable regions." However, our results run contrary to a numerical simulation study of Quinn and Hastings (1987) that, unlike ours, applies to sedentary populations experiencing independent environments (Gilpin 1988).

While our work provides a diversity of analytical insights into the interactive effects of temporal variability, spatial heterogeneity, and movement on long-term population growth, many challenges remain. Most notably, are there analytic approximations for relatively sedentary populations? What effect do correlations in the temporal fluctuations have on the stochastic growth rate? Can the explicit formulas for stochastic growth rates in two-patch environments be extended to special classes of higher dimensional models? Can one extend the analysis to account for density-dependent feedbacks? Answers to these questions are likely to provide important insights into the evolution of dispersal and metapopulation persistence.

Appendix A. Proof of Proposition 3.1

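Define the matrix $R$ by

$$R := \operatorname{diag}(\mu) + D.$$

Equation (3) becomes

$$dX_t = \operatorname{diag}(X_t)\Gamma^T\,dB_t + R^T X_t\,dt.$$

Recall that $Y_t^j = X_t^j/(X_t^1 + \cdots + X_t^n)$ for each $1 \le j \le n$ and $Y_t = (Y_t^1, \ldots, Y_t^n)^T$. Fix $j$ and define $f_j(x_1, \ldots, x_n) := x_j/(x_1 + \cdots + x_n)$, so that $Y_t^j = f_j(X_t)$. Using $\partial_k$ to denote differentiation with respect to $x_k$, observe that

$$\partial_j f_j(x_1, \ldots, x_n) = \frac{\sum_{\ell \ne j} x_\ell}{\left(\sum_\ell x_\ell\right)^2}, \qquad \partial_k f_j(x_1, \ldots, x_n) = \frac{-x_j}{\left(\sum_\ell x_\ell\right)^2}, \quad k \ne j.$$

Moreover,

$$\partial_{jj} f_j(x_1, \ldots, x_n) = \frac{-2\sum_{\ell \ne j} x_\ell}{\left(\sum_\ell x_\ell\right)^3},$$

$$\partial_{jk} f_j(x_1, \ldots, x_n) = \frac{-\sum_\ell x_\ell + 2x_j}{\left(\sum_\ell x_\ell\right)^3}, \quad k \ne j,$$

and

$$\partial_{km} f_j(x_1, \ldots, x_n) = \frac{2x_j}{\left(\sum_\ell x_\ell\right)^3}, \quad k, m \ne j.$$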
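The first and second partial derivatives of $f_j(x) = x_j/(x_1 + \cdots + x_n)$ can be checked against finite differences. The sketch below is a sanity check written for this purpose; the function names and the test point are hypothetical.

```python
def f(j, x):
    """Proportion of the total held by coordinate j."""
    return x[j] / sum(x)


def grad_f(j, x):
    """Analytic gradient of f_j: (S - x_j)/S^2 at k = j, and -x_j/S^2 otherwise."""
    s = sum(x)
    return [(s - x[j]) / s ** 2 if k == j else -x[j] / s ** 2 for k in range(len(x))]


def hess_f(j, x):
    """Analytic Hessian of f_j, built from the three cases for the index pair."""
    s = sum(x)
    n = len(x)
    h = [[2.0 * x[j] / s ** 3 for _ in range(n)] for _ in range(n)]  # k, m != j
    for k in range(n):
        if k != j:
            h[j][k] = h[k][j] = (2.0 * x[j] - s) / s ** 3            # one index equals j
    h[j][j] = -2.0 * (s - x[j]) / s ** 3                             # both indices equal j
    return h


def shifted(x, k, step):
    """Copy of x with coordinate k moved by step."""
    y = list(x)
    y[k] += step
    return y


point = [0.5, 1.5, 2.0]
j, eps = 0, 1e-5
numeric_grad = [(f(j, shifted(point, k, eps)) - f(j, shifted(point, k, -eps))) / (2 * eps)
                for k in range(3)]
numeric_hess = [[(grad_f(j, shifted(point, m, eps))[k] - grad_f(j, shifted(point, m, -eps))[k]) / (2 * eps)
                 for m in range(3)] for k in range(3)]
```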



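It follows from Itô's lemma (Gardiner 2004) that for each $1 \le j \le n$,

$$dY_t^j = \sum_{k=1}^n \partial_k f_j(X_t)\, X_t^k\, \Gamma_{*k}^T\,dB_t + \sum_{k=1}^n \partial_k f_j(X_t)\, R_{*k}^T X_t\,dt + \frac12 \sum_{k,m=1}^n \partial_{km} f_j(X_t)\, X_t^k X_t^m\, \Sigma_{km}\,dt,$$

where $\Gamma_{*k}$ and $R_{*k}$ denote the $k$-th columns of the matrices $\Gamma$ and $R$ respectively. Substituting in the derivatives of $f_j$ gives

$$\begin{aligned}
dY_t^j ={}& -\sum_{k \ne j} Y_t^j Y_t^k\, \Gamma_{*k}^T\,dB_t + \sum_{k \ne j} Y_t^j Y_t^k\, \Gamma_{*j}^T\,dB_t \\
& - \sum_{k \ne j} Y_t^j\, Y_t^T R_{*k}\,dt + \sum_{k \ne j} Y_t^k\, Y_t^T R_{*j}\,dt \\
& + \frac12 \sum_{k,m \ne j} 2\, Y_t^j Y_t^k Y_t^m\, \Sigma_{km}\,dt - \frac12 \sum_{k \ne j} 2\, (Y_t^j)^2 Y_t^k\, \Sigma_{jj}\,dt \\
& + \frac12 \sum_{k \ne j} 2 \left(-Y_t^j Y_t^k + 2 (Y_t^j)^2 Y_t^k\right) \Sigma_{kj}\,dt \\
={}& -Y_t^j \sum_k Y_t^k\, \Gamma_{*k}^T\,dB_t + Y_t^j\, \Gamma_{*j}^T\,dB_t - Y_t^j \sum_k Y_t^T R_{*k}\,dt + Y_t^T R_{*j}\,dt \\
& + Y_t^j \sum_{k,m} Y_t^k Y_t^m\, \Sigma_{km}\,dt - Y_t^j \sum_k Y_t^k\, \Sigma_{kj}\,dt.
\end{aligned}$$

Since $D\mathbf{1} = 0$, we have $R\mathbf{1} = \operatorname{diag}(\mu)\mathbf{1} = \mu$, and the above system of SDEs can be written in the following compact way

$$\begin{aligned}
dY_t ={}& -Y_t Y_t^T \Gamma^T\,dB_t + \operatorname{diag}(Y_t)\Gamma^T\,dB_t \\
& - Y_t Y_t^T \mu\,dt + R^T Y_t\,dt + Y_t Y_t^T \Sigma Y_t\,dt - \operatorname{diag}(Y_t)\Sigma Y_t\,dt \\
={}& \left(\operatorname{diag}(Y_t) - Y_t Y_t^T\right)\Gamma^T\,dB_t + D^T Y_t\,dt \\
& + \left(\operatorname{diag}(Y_t) - Y_t Y_t^T\right)(\mu - \Sigma Y_t)\,dt.
\end{aligned}$$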
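One quick consistency check on the compact SDE is that its drift and diffusion coefficients sum to zero over the coordinates whenever $Y_t$ lies on the simplex, so the proportions keep summing to one. The sketch below evaluates the coefficients at a single point; the function name and all numerical values ($y$, $\mu$, $D$, $\Gamma$) are hypothetical, and $\Sigma = \Gamma^T\Gamma$ is assumed.

```python
def simplex_sde_coefficients(y, mu, D, Gamma):
    """Drift vector and diffusion matrix of the SDE for the patch proportions.

    drift     = D^T y + (diag(y) - y y^T) (mu - Sigma y)
    diffusion = (diag(y) - y y^T) Gamma^T, with Sigma = Gamma^T Gamma (assumed).
    """
    n = len(y)
    P = [[(y[i] if i == j else 0.0) - y[i] * y[j] for j in range(n)] for i in range(n)]
    Sigma = [[sum(Gamma[k][i] * Gamma[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    Sigma_y = [sum(Sigma[i][j] * y[j] for j in range(n)) for i in range(n)]
    w = [mu[i] - Sigma_y[i] for i in range(n)]
    drift = [sum(D[j][i] * y[j] for j in range(n)) + sum(P[i][j] * w[j] for j in range(n))
             for i in range(n)]
    diffusion = [[sum(P[i][j] * Gamma[m][j] for j in range(n)) for m in range(n)]
                 for i in range(n)]
    return drift, diffusion


y = [0.2, 0.3, 0.5]                       # a point on the simplex
mu = [1.0, 0.5, 0.2]
D = [[-1.0, 0.5, 0.5],                    # rows sum to zero, as for a generator
     [0.2, -0.4, 0.2],
     [0.3, 0.3, -0.6]]
Gamma = [[1.0, 0.2, 0.0],
         [0.0, 0.8, 0.1],
         [0.0, 0.0, 0.5]]
drift, diffusion = simplex_sde_coefficients(y, mu, D, Gamma)
drift_total = sum(drift)
diffusion_column_totals = [sum(diffusion[i][m] for i in range(3)) for m in range(3)]
```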

Now that the SDE for $Y_t$ is established, we will prove the ergodicity of the Markov process $(Y_t)_{t \ge 0}$ that it defines.

Existence. Clearly $(Y_t)_{t \ge 0}$ is a Feller process. Since for each $t \ge 0$, the random vector $Y_t$ takes values in the compact state space $\Delta$, it trivially follows that the family of probability measures $\{\mathbb{P}^y\{Y_t \in \cdot\} : t \ge 0\}$ is uniformly tight for any fixed $y \in \Delta$, where $\mathbb{P}^y$ denotes the law of the process with $Y_0 = y$. Hence, by the Krylov-Bogolyubov theorem (see, for example, Da Prato and Zabczyk [1996, Corollary 3.1.2]), there exists at least one probability measure $\mu$ on $\Delta$ which is an invariant measure for the process $(Y_t)_{t \ge 0}$, that is,

$$\int_\Delta \mathbb{P}^y\{Y_t \in \cdot\}\,\mu(dy) = \mu(\cdot) \quad \text{for all } t \ge 0.$$


Uniqueness. The uniqueness of the invariant measure for $(Y_t)_{t \ge 0}$ is ensured by the Doob-Khasminskii theorem (see, for example, Da Prato and Zabczyk [1996, Chapter 7]), provided this process satisfies the following two properties:

• $(Y_t)_{t \ge 0}$ is irreducible, that is, $\mathbb{P}^y\{Y_t \in V\} > 0$ for any $t > 0$ and any open set $V$ in the simplex $\Delta$.

• $(Y_t)_{t \ge 0}$ is strong Feller, that is, $\Delta \ni y \mapsto \int \mathbb{P}^y\{Y_t \in dz\} f(z)$ is continuous for any bounded measurable function $f : \Delta \to \mathbb{R}$.

These conditions also ensure that $(Y_t)_{t \ge 0}$ converges in law to the unique invariant measure. We next establish irreducibility and the strong Feller property of $(Y_t)_{t \ge 0}$ separately.

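(a) Irreducibility. It clearly suffices to show that the process $(X_t)_{t \ge 0}$ as defined by (3) is irreducible, that is, that $\mathbb{P}^x\{X_t \in U\} > 0$ for each $t > 0$, $x \in \mathbb{R}^n_+ \setminus \{0\}$ and open set $U \subset \mathbb{R}^n_+$.

We will first prove that $\mathbb{P}^x\{X_t^i > 0\ \forall i\} = 1$ for all $t > 0$ and all $x \in \mathbb{R}^n_+ \setminus \{0\}$, by induction on the size of the set $\mathcal{Z} := \{1 \le i \le n : x_i = 0\}$. First consider the case $\#\mathcal{Z} = 0$. By a suitable comparison theorem for SDEs (Geiß and Manthey [1994, Theorem 1.1]), $\mathbb{P}^x\{X_t \ge \tilde X_t \text{ for all } t \ge 0\} = 1$, where $\tilde X$ is defined by

$$d\tilde X_t^i = \mu_i \tilde X_t^i\,dt + \tilde X_t^i\,dE_t^i + D_{ii} \tilde X_t^i\,dt, \quad 1 \le i \le n.$$

This SDE has the unique solution $\tilde X_t^i = x_i \exp\{E_t^i + (\mu_i + D_{ii} - \Sigma_{ii}/2)t\} > 0$, so

(34) $\mathbb{P}^x\{X_t^i > 0\ \forall i \text{ for all } t \ge 0\} = 1, \quad x \in (0,\infty)^n.$

Now suppose $\#\mathcal{Z} = k < n$. By the irreducibility of the infinitesimal generator matrix $D$, there exist $i_0 \in \mathcal{Z}$, $j_0 \notin \mathcal{Z}$ such that $D_{j_0 i_0} > 0$. Consider the new SDE

$$d\hat X_t^i = \mu_i \hat X_t^i\,dt + \hat X_t^i\,dE_t^i + D_{ii} \hat X_t^i\,dt, \quad i \ne i_0,$$

and

$$d\hat X_t^{i_0} = \mu_{i_0} \hat X_t^{i_0}\,dt + \hat X_t^{i_0}\,dE_t^{i_0} + \left(D_{j_0 i_0} \hat X_t^{j_0} + D_{i_0 i_0} \hat X_t^{i_0}\right)dt.$$

By the same comparison theorem, $\mathbb{P}^x\{X_t \ge \hat X_t \text{ for all } t \ge 0\} = 1$. Clearly, $\mathbb{P}^x\{\hat X_t^i > 0\} = 1$ for all $i \notin \mathcal{Z}$ and for all $t \ge 0$. Since $\hat X_0^{i_0} = 0$ and $\hat X_0^{j_0} > 0$, at time $t = 0$ the diffusion component of $\hat X^{i_0}$ vanishes but its drift coefficient is strictly positive. It follows that $\mathbb{P}^x\{\hat X_t^{i_0} > 0\} = 1$ for all $t > 0$. Hence, at any positive time $t$, almost surely $\hat X_t$ has at most $k - 1$ zero coordinates, and, by the comparison theorem, so does $X_t$. Using the Markov property and the induction hypothesis, we deduce that $\mathbb{P}^x\{X_t^i > 0\ \forall i\} = 1$ for all $t > 0$. This proves that each component of $X$ is strictly positive with probability 1 for each $t > 0$.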

Let $\varphi : (0,\infty)^n \to \mathbb{R}^n$ be the homeomorphism given by $\varphi(x) = (\log x_1, \ldots, \log x_n)$. Set $H_t = \varphi(X_t)$, with $H_t = (H_t^1, \ldots, H_t^n)^T$. By (34), this stochastic process is well defined provided $X_0 \in (0,\infty)^n$. Note that $(H_t)_{t \ge 0}$ satisfies the following SDE,

$$dH_t^i = \left(\mu_i - \tfrac12 \Sigma_{ii}\right)dt + dE_t^i + e^{-H_t^i} \sum_j D_{ji}\, e^{H_t^j}\,dt, \quad 1 \le i \le n.$$

By Girsanov's theorem (see Ikeda and Watanabe [1989, Section 4 of Chapter IV]), the law of $(\Gamma^T)^{-1} H_t$ (and hence the law of $H_t$) is absolutely continuous with respect to the law of $B_t$ for any $t > 0$. Thus, $\mathbb{P}^x\{H_t \in V\} > 0$ for any open set $V \subset \mathbb{R}^n$. Finally, for any $x \in \mathbb{R}^n_+ \setminus \{0\}$,

$$\mathbb{P}^x\{X_t \in U\} = \int_{(0,\infty)^n} \mathbb{P}^x\{X_{t/2} \in dy\}\, \mathbb{P}^y\{X_{t/2} \in U\} = \int_{(0,\infty)^n} \mathbb{P}^x\{X_{t/2} \in dy\}\, \mathbb{P}^{\varphi(y)}\{H_{t/2} \in \varphi(U)\} > 0.$$

(b) Strong Feller property. Note that $H$ satisfies an SDE of the form $dH_t = \Gamma^T dB_t + b(H_t)\,dt$ for some smooth function $b : \mathbb{R}^n \to \mathbb{R}^n$. For each $K \ge 1$, consider a new SDE

$$dH_t^K = \Gamma^T dB_t + b^K(H_t^K)\,dt,$$

where $b^K : \mathbb{R}^n \to \mathbb{R}^n$ is a smooth bounded function with bounded derivative such that $b^K(x) = b(x)$ on $[-K, K]^n$. Since the matrix $\Gamma$ is nonsingular, the associated Fisk-Stratonovich type generator of $(H_t^K)_{t \ge 0}$ is trivially hypoelliptic, which in turn implies that $(H_t^K)_{t \ge 0}$ is strong Feller for every $K \ge 1$ (see Ikeda and Watanabe [1989, Section 8 of Chapter V]). If we define a sequence of stopping times $\tau_K := \inf\{t : \|H_t\|_\infty \ge K\}$, then $H_0^K = H_0$ implies $H_t^K = H_t$ for $t \in [0, \tau_K]$. Let $t > 0$ and $f$ be a bounded measurable function. Fix $\varepsilon > 0$. Then for any $x \in \mathbb{R}^n$,

$$\left|\mathbb{E}^x[f(H_t)] - \mathbb{E}^x[f(H_t^K)]\right| \le 2\|f\|_\infty\, \mathbb{P}^x\{\tau_K < t\}.$$

Hence, for any open neighborhood $U(x)$ of $x$,

$$\left|\mathbb{E}^x[f(H_t)] - \mathbb{E}^y[f(H_t)]\right| \le \left|\mathbb{E}^x[f(H_t^K)] - \mathbb{E}^y[f(H_t^K)]\right| + 4\|f\|_\infty \sup_{z \in U(x)} \mathbb{P}^z\{\tau_K < t\} \quad \text{for all } y \in U(x).$$

Since almost surely $\tau_K \uparrow \infty$, we can choose $K$ large enough such that $\mathbb{P}^x\{\tau_K < t\} < \varepsilon(8\|f\|_\infty)^{-1}$. Moreover, by the Feller property of $(H_t)_{t \ge 0}$, there exists a neighborhood $U^1(x)$ of $x$ such that $\sup_{z \in U^1(x)} \mathbb{P}^z\{\tau_K < t\} < \varepsilon(8\|f\|_\infty)^{-1}$. From the strong Feller property of $(H_t^K)_{t \ge 0}$, there exists a neighborhood $U^2(x)$ of $x$ such that $|\mathbb{E}^x[f(H_t^K)] - \mathbb{E}^y[f(H_t^K)]| < \varepsilon/2$ for all $y \in U^2(x)$. Thus, $|\mathbb{E}^x[f(H_t)] - \mathbb{E}^y[f(H_t)]| < \varepsilon$ for all $y \in U^1(x) \cap U^2(x)$. Hence, $x \mapsto \mathbb{E}^x[f(H_t)]$ is continuous. Now, for $t > 0$ and a bounded measurable function $g : \mathbb{R}^n_+ \to \mathbb{R}$,

$$\mathbb{E}^x[g(X_t)] = \int_{(0,\infty)^n} \mathbb{P}^x\{X_{t/2} \in dy\}\, \mathbb{E}^{\varphi(y)}\left[g(\varphi^{-1}(H_{t/2}))\right], \quad x \in \mathbb{R}^n_+ \setminus \{0\}.$$

Therefore, the map $x \mapsto \mathbb{E}^x[g(X_t)]$ is continuous, and so $(X_t)_{t \ge 0}$ is a strong Feller process. It follows easily that $(Y_t)_{t \ge 0}$ is also a strong Feller process. □



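Appendix B. Proof of Proposition 4.1

By rescaling time $\tau := \delta t$ and setting $\varepsilon := 1/\delta$, the SDE for $Y_t$ becomes

(35) $dY_\tau^\varepsilon = \sqrt{\varepsilon}\, f(Y_\tau^\varepsilon)\,dB_\tau + \varepsilon\, g(Y_\tau^\varepsilon)\,d\tau + Q^T Y_\tau^\varepsilon\,d\tau$

where $f(y) := (\operatorname{diag}(y) - yy^T)\Gamma^T$, $g(y) := (\operatorname{diag}(y) - yy^T)(\mu - \Sigma y)$, and $Y_\tau^\varepsilon := Y_{\tau/\delta}$.

For $\varepsilon > 0$, let $\nu_\varepsilon$ be the unique invariant probability measure for (35) guaranteed by Proposition 3.1. The irreducibility of $Q$ implies that $\pi$ is the unique stable point for the ODE

$$\frac{d}{d\tau} y_\tau^x = Q^T y_\tau^x, \qquad y_0^x = x \in \Delta,$$

and that $\lim_{\tau \to \infty} y_\tau^x = \pi$ for any $x \in \Delta$. Write $\nu_0$ for the Dirac measure at the point $\pi \in \Delta$. By the compactness of Borel probability measures on $\Delta$ in the topology of weak convergence, it suffices to show that if $\nu_{\varepsilon_k}$ converges weakly to $\nu$ for some sequence $\varepsilon_k \downarrow 0$, then $\nu = \nu_0$, and hence it is sufficient to check that

$$\int_\Delta h(y_\tau^x)\,\nu(dx) = \int_\Delta h(x)\,\nu(dx)$$

for every $\tau > 0$ and Lipschitz function $h : \Delta \to \mathbb{R}$.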
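The convergence of the deterministic flow $\frac{d}{d\tau} y = Q^T y$ to the stationary vector $\pi$ can be seen in a small example. The sketch below uses a forward Euler scheme; the two-state generator $Q$, the step size and the function name are hypothetical choices made for illustration.

```python
def flow_to_stationary(Q, y0, step=0.01, n_steps=5000):
    """Integrate dy/dtau = Q^T y with forward Euler, starting from y0."""
    n = len(y0)
    y = list(y0)
    for _ in range(n_steps):
        dy = [sum(Q[j][i] * y[j] for j in range(n)) for i in range(n)]  # (Q^T y)_i
        y = [y[i] + step * dy[i] for i in range(n)]
    return y


# Hypothetical irreducible generator; Q^T pi = 0 gives pi = (2/3, 1/3).
Q = [[-1.0, 1.0],
     [2.0, -2.0]]
limit_from_patch_1 = flow_to_stationary(Q, [1.0, 0.0])
limit_from_patch_2 = flow_to_stationary(Q, [0.0, 1.0])
```

Both starting points lead to the same limit, which is the sense in which $\pi$ attracts every initial distribution on the simplex in this example.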

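Set $Y^k = Y^{\varepsilon_k}$ and $\nu_k = \nu_{\varepsilon_k}$ for ease of notation. Let $L$ be the Lipschitz constant for the function $h$. Then,

$$\begin{aligned}
\left|\int_\Delta \left(h(y_\tau^x) - h(x)\right)\nu(dx)\right|
&= \lim_{k \to \infty} \left|\int_\Delta \left(h(y_\tau^x) - h(x)\right)\nu_k(dx)\right| \\
&\le \limsup_{k \to \infty} \left|\int_\Delta \left(\mathbb{E}^x[h(Y_\tau^k)] - h(x)\right)\nu_k(dx)\right| + \limsup_{k \to \infty} \left|\int_\Delta \mathbb{E}^x\left[h(y_\tau^x) - h(Y_\tau^k)\right]\nu_k(dx)\right| \\
&\le \limsup_{k \to \infty} L \int_\Delta \mathbb{E}^x\left[\|y_\tau^x - Y_\tau^k\|\right]\nu_k(dx),
\end{aligned}$$

where the first term on the second line equals $0$ by invariance of $\nu_k$, and $\|\cdot\|$ is the usual Euclidean norm on $\mathbb{R}^n$.

It remains to show that $\lim_{k \to \infty} \sup_{x \in \Delta} \mathbb{E}^x[\|y_\tau^x - Y_\tau^k\|] = 0$. Fix $x \in \Delta$ and set $Z_\tau^k := y_\tau^x - Y_\tau^k$. By Itô's formula,

$$\begin{aligned}
\mathbb{E}^x\left[\|Z_\tau^k\|^2\right]
&= \mathbb{E}^x \int_0^\tau \left[2\langle Z_s^k, Q^T Z_s^k\rangle - 2\varepsilon_k \langle Z_s^k, g(Y_s^k)\rangle + \varepsilon_k \operatorname{Tr}\left(f(Y_s^k) f(Y_s^k)^T\right)\right] ds \\
&\le 2\|Q^T\| \int_0^\tau \mathbb{E}^x\left[\|Z_s^k\|^2\right] ds + \varepsilon_k C \tau,
\end{aligned}$$

for some constant $C$ that does not depend on $x$ or $\tau$, where we write $\langle \cdot, \cdot \rangle$ for the usual Euclidean inner product on $\mathbb{R}^n$, and $\|Q^T\| = \sup_{\|z\| = 1} |\langle z, Q^T z\rangle|$. Gronwall's inequality implies that

$$\mathbb{E}^x\left[\|Z_\tau^k\|^2\right] \le \varepsilon_k C \tau\, e^{2\|Q^T\|\tau},$$

and so, by Jensen's inequality,

$$\mathbb{E}^x\left[\|Z_\tau^k\|\right] \le \sqrt{\varepsilon_k C \tau}\, e^{\|Q^T\|\tau}.$$

It follows that $\lim_{k \to \infty} \sup_{x \in \Delta} \mathbb{E}^x[\|y_\tau^x - Y_\tau^k\|] = 0$, and hence $\nu = \nu_0$, as required.

In particular,

$$\chi(\delta) = \mu^T \int_\Delta y\,\nu_{1/\delta}(dy) - \frac12 \int_\Delta y^T \Sigma y\,\nu_{1/\delta}(dy) \to \mu^T \pi - \frac12 \pi^T \Sigma \pi$$

as $\delta \to \infty$. □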



Appendix C. Proof of Proposition 5.1

Fix $\delta \in [0,\infty)$, and denote our underlying probability space by $(\Omega, \mathcal{F}, \mathbb{P})$. Define



s,t 



X Q 



< s < t 



897 by $f t(x, = Xj (a;), where (X^)„>5 is the unique solution of 



diag(Xf)r^dB 



898 with Rs := diag(^) + 5Q. 

899 Note that for all < s < w < 

(36) $,,t(-,^)-$t,t(-,c.)o$f^^(.,c.). 

900 It is easy to see that $f j(-,a;) is a linear map from M" to M" and thus can be represented by a matrix ((w). 

901 From (361, it follows that 

= Mt,i(c.)Mf,„(^) for aU < s < ^i; < t. 

902 Since Mf ^ is constructed from (B„ — Bs)„g[5 jj, the matrices {M^ fc+i}fceN are independent. Moreover, since the 

903 drift and the diffusion coefficients do not depend on time, {M^. is a stationary sequence. 

904 We note that the Lyapunov exponent x(<^) of (^t )t>o is the same as 

Hm E[A;-ilog||M^,,.||] = inf E[fc-ilog||Mg,,||] , 



905 where we set 



j4|| sup < Ajj Xj : — 1, x^ > OVfc 



2 J 



906 for a matrix A with nonnegative entries. 



Set 



:={xe 



(37) 



X > 0}. If (5 > 0, then it follows from the irreducibility of Q that 
Mf t(M'^) C {x e M" : > for ahl < i < n} U {0} 



908 and hence x((5) is analytic on (0,oo) by Ruelle, 1979 Theorem 3.1]. 

The condition (37) fails to hold when $\delta = 0$ and so we must proceed differently. We first claim that for fixed $t > 0$ the map $\delta \mapsto t^{-1}\mathbb{E}[\log \|M_{0,t}^\delta\|]$ is upper semicontinuous on $[0,\infty)$. To see this, fix $\delta \in [0,\infty)$. Set $\log^+ x = \max(0, \log x)$ and $\log^- x = \min(0, \log x)$. It follows from the continuous dependence of the solution of a SDE on its parameters (Gardiner [2004, 4.3.2]) that $X_t^{\delta'} \to X_t^\delta$ almost surely as $\delta' \to \delta$, which implies that $\|M_{0,t}^{\delta'}\| \to \|M_{0,t}^\delta\|$ almost surely as $\delta' \to \delta$. An application of Gronwall's lemma gives that $\mathbb{E}[\sup_{0 \le \delta \le c} \|M_{0,t}^\delta\|] < \infty$ for each $c > 0$. Hence,

$$\mathbb{E}\left[\log^+ \|M_{0,t}^{\delta'}\|\right] \to \mathbb{E}\left[\log^+ \|M_{0,t}^\delta\|\right] \quad \text{as } \delta' \to \delta.$$

On the other hand, by Fatou's lemma,

$$\mathbb{E}\left[-\log^- \|M_{0,t}^\delta\|\right] \le \liminf_{\delta' \to \delta} \mathbb{E}\left[-\log^- \|M_{0,t}^{\delta'}\|\right].$$

Combining these two inequalities gives

$$\limsup_{\delta' \to \delta} \mathbb{E}\left[\log \|M_{0,t}^{\delta'}\|\right] \le \mathbb{E}\left[\log \|M_{0,t}^\delta\|\right],$$

and the claim follows.

Since $\chi(\delta) = \inf_{t > 0} t^{-1}\mathbb{E}[\log \|M_{0,t}^\delta\|]$ is the infimum of a family of upper semicontinuous functions, it is itself upper semicontinuous, or equivalently, $\limsup_{\delta' \to \delta} \chi(\delta') \le \chi(\delta)$. In particular, $\limsup_{\delta \to 0} \chi(\delta) \le \chi(0)$.

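We now prove the opposite inequality that $\liminf_{\delta \to 0} \chi(\delta) \ge \chi(0)$. Fix $\delta > 0$, and without loss of generality suppose that $\max_i (-Q_{ii}) = 1$, so that if $x_i \ge z_i \ge 0$ for $1 \le i \le n$, then $(Q^T x)_i \ge -x_i$ for $1 \le i \le n$. Consider the two SDEs

$$dX_t^\delta = \operatorname{diag}(X_t^\delta)\Gamma^T dB_t + \left(\operatorname{diag}(\mu) + \delta Q^T\right)X_t^\delta\,dt$$

and

$$dZ_t^\delta = \operatorname{diag}(Z_t^\delta)\Gamma^T dB_t + \operatorname{diag}(\mu - \delta)Z_t^\delta\,dt.$$

If $X_0^\delta = Z_0^\delta$, then, by the comparison theorem,

$$X_t^\delta \ge Z_t^\delta \quad \text{for all } t \ge 0$$

almost surely.

Thus, the Lyapunov exponent of $(X_t^\delta)_{t \ge 0}$ dominates that of $(Z_t^\delta)_{t \ge 0}$. Note that the coordinates of $Z^\delta$ are decoupled and hence the Lyapunov exponent of this process is the maximum of the stochastic growth rates for the individual coordinate processes. Therefore,

$$\chi(\delta) \ge \max_i \left(\mu_i - \frac{\sigma_{ii}}{2} - \delta\right).$$

In particular,

(38) $\liminf_{\delta \downarrow 0} \chi(\delta) \ge \max_i \left(\mu_i - \frac{\sigma_{ii}}{2}\right) = \chi(0),$

as required. □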



Appendix D. Proof of Theorem 5.2

Recall that

$$dY_t = \left(\operatorname{diag}(Y_t) - Y_t Y_t^T\right)\Gamma^T dB_t + D^T Y_t\,dt + \left(\operatorname{diag}(Y_t) - Y_t Y_t^T\right)(\mu - \Sigma Y_t)\,dt,$$

where $D$ is of the form $\delta Q$, with $Q$ an irreducible infinitesimal generator matrix and $\delta > 0$. Moreover, $Q$ is assumed to be reversible with respect to the unique probability vector $\pi$ satisfying $Q^T\pi = 0$; that is, that $\pi_i Q_{ij} = \pi_j Q_{ji}$ for all $i, j$.

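Define an inner product on $\mathbb{R}^n$ by $\langle u, v\rangle_\pi := \sum_i u_i v_i/\pi_i = u^T \operatorname{diag}(\pi)^{-1} v$. It follows from reversibility that the linear operator $v \mapsto Q^T v$ is self-adjoint with respect to this inner product; that is, that $\langle u, Q^T v\rangle_\pi = \langle Q^T u, v\rangle_\pi$ for all $u, v$.

From the spectral theorem and the Perron-Frobenius theorem, the linear operator $v \mapsto Q^T v$ has eigenvalues $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_{n-1} < \lambda_n = 0$ and corresponding orthonormal eigenvectors $\xi_1, \ldots, \xi_n$ with $\xi_n = \pi$ such that

$$Q^T v = \sum_{k=1}^{n-1} \lambda_k \langle v, \xi_k\rangle_\pi\, \xi_k.$$

Note that

(39) $\langle v, Q^T v\rangle_\pi \le -\kappa \|v\|_\pi^2 \quad \text{whenever } \langle v, \pi\rangle_\pi = 0,$

where $\kappa := -\lambda_{n-1} > 0$ and $\|\cdot\|_\pi$ is the norm associated with the inner product $\langle \cdot, \cdot\rangle_\pi$.

Note also that if $\mathbf{1}^T v = 0$, then

$$w = \sum_{k=1}^{n-1} \lambda_k^{-1} \langle v, \xi_k\rangle_\pi\, \xi_k$$

is the unique vector with the properties

$$\langle w, \pi\rangle_\pi = 0 \quad \text{and} \quad Q^T w = v.$$

In particular,

$$\mathbf{1}^T\left(\operatorname{diag}(\pi) - \pi\pi^T\right)(\mu - \Sigma\pi) = \left(\pi^T - \pi^T\right)(\mu - \Sigma\pi) = 0,$$

and so there is a unique vector we denote $\nu$ such that

(40) $\mathbf{1}^T\nu = \langle \nu, \pi\rangle_\pi = 0 \quad \text{and} \quad Q^T\nu = -\left(\operatorname{diag}(\pi) - \pi\pi^T\right)(\mu - \Sigma\pi).$

We emphasize that $\nu$ does not depend on $\delta$.

Consider the stochastic process

$$U_t := \delta^{1/2}\left(Y_{t/\delta} - \pi - \delta^{-1}\nu\right),$$

so that

$$Y_{t/\delta} = \pi + \delta^{-1}\nu + \delta^{-1/2} U_t.$$

Observe that $\pi + \delta^{-1}\nu$ is indeed a probability vector for $\delta$ sufficiently large. Because we are only interested in the equilibrium law of the process $Y$, we assume that $Y_0 = \pi + \delta^{-1}\nu$ and hence $U_0 = 0$. Note that $0 = \mathbf{1}^T U_t = \langle U_t, \pi\rangle_\pi$ for all $t \ge 0$.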

952 for allt > 0. 

953 We have for the standard Brownian motion Bj :~ JsBj/^ that 

dVt = (diag(5-5Ut + 7r + (5-V) - {6~^\Jt + + 5^^v){6-^\5t + + S'^vf'^ dBt 
+ S-^Q'^iS-^Vt + TT + ,5-^) dt 

+ S-^ (diag((5-5Uf +TT + S-^i^) - ((5-5Uf + tt + 6-^iy){6-^Vt + tt + S-^i'f 
X (^^ - I]((5-5Ut + TT + S-^iy)j dt. 

954 Using Q-^n = and pOl), we get 



dVt = [diag(7r) - TTTT^] dBt + Q^Ut dt 



dB, 



955 where 



956 
957 



and 



Ai (u) :— [diag(u) — un'^ — ttw"^] 
Ai{u) := [-uu^ + diag(i^) - -kv^ - v'k^\ 
Az{u) := {-uv^ - vu^\ 
A2{u) -vv^V^ 

h\{u) :— —TTU^ fi — un'^ jj, + nu^'Sn + un'^'Sn + nn'^TiU 
+ diag(w)/^ — diag(7r)Su — diag(w)S7r 



63 (u) 



T T T 



diag(7r)Ei' — diag(u)Eu + diag(t^)/^ — diag(i^)E7r 
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963 
964 



965 By Ito's lemma, 



b2{u) := —uiy^ II — vv^ 

— diag(u)Si/ — diag(i^)Su 
6| (u) := —vv'^^i + uu^Hu + uv^Hu + vn^Hu + -ku^YjU + z/Tr'^Sz/ + z/z/^Ett 
— diag(i^)Ez^ 

63 (u) := uz/'^E!/ + fu-^Ez/ + vu^'Eu 
bz (u) := i/i/^Ez/. 

rf||Ut||2 = 2Uf diag(7r)-i [diag(7r) - tttt^] rfB^ + 2(Ut, Q^Uj)^ dt 

4 

+ 2 ^ ,5-5 U^diag(7r)-i Ai (U*) dBt 
7 

+ 2 ^ ,5-5Uf diag(7r)-i6| (U() dt 
1=1 

+ Tr (^diag(7r)-^ [diag(7r) - tttt"^] T'^F [diag(7r) - tttt^] )dt 
+ Tr (diag(7r)-i^(5-5A|(Ut) x ^<5-5A|(Ut)^ ) dt. 

966 Note also that 

(41) \U\\ < CTs \<i<n, 

967 for an appropriate constant C because < y/ < 1, 1 < i < n. Each function 

u u'^diag(7r)"^6|(u), 2 < f < 7, 

968 is a polynomial in u with total degree at most I, and each function 

u H- Tr (diag(7r)-^ {u)Ai^ {uf^ , 1 < /, £" < 4, 

969 is a polynomial in u with total degree at most £' + £". 

970 It follows that 

(42) |e[||U,||2] <-2/.E[||U,||^]+C" 

971 for alH > for a suitable constant C that does not depend on 6. Hence, 

(43) supE[||U,||2] <g 

972 (recall that Uo = 0). 

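Let $(V_t)_{t \ge 0}$ be the solution of the stochastic differential equation

$$dV_t = \left[\operatorname{diag}(\pi) - \pi\pi^T\right]\Gamma^T dB_t + Q^T V_t\,dt$$

with $V_0 = U_0 = 0$. Note that $d(\mathbf{1}^T V_t) = 0$ for all $t \ge 0$, and so $\langle V_t, \pi\rangle_\pi = \mathbf{1}^T V_t = 0$ for all $t \ge 0$. It is readily checked that

$$V_t = \int_0^t \exp\left(Q^T(t - s)\right)\left[\operatorname{diag}(\pi) - \pi\pi^T\right]\Gamma^T dB_s.$$

So $V$ is a Gaussian process for which $\mathbb{E}[V_t] = 0$ and

(44) $\mathbb{E}[V_t V_t^T] = \int_0^t \exp(Q^T s)\left(\operatorname{diag}(\pi) - \pi\pi^T\right)\Sigma\left(\operatorname{diag}(\pi) - \pi\pi^T\right)\exp(Qs)\,ds$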
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977 for alH > 0. Consequently, 
(45) 

978 for 1 < i < n and p > 0. 

979 In the notation above, 

d{lJt - V 



supE [\V;\P] < oo 
t>o 



dBt + 



.1=2 



dt. 



Applying Itô's lemma and a combination of (41), (43) and (45), we can argue along the lines we followed to establish (42) to see that

|e [||U, - V,|l^] < ^2nV. [||U, - v.n^] + r ^c" 

for all $t \ge 0$ for a suitable constant $C''$ that does not depend on $\delta$. Hence,

(46) 



supE[||Ui-V,||2] <5-i^. 



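Now let $Y_\infty$, $U_\infty$ and $V_\infty$ be random vectors that are distributed according to the equilibrium laws of $(Y_t)_{t \ge 0}$, $(U_t)_{t \ge 0}$ and $(V_t)_{t \ge 0}$, respectively. Also let $U_\infty^i$ and $V_\infty^i$ be the $i$-th components of the vectors $U_\infty$ and $V_\infty$, respectively.

From (41), (43) and the linearity of the function $b_2$,

\[
0 = Q^T \mathbb{E}[U_\infty] + \delta^{-1} b_2(\mathbb{E}[U_\infty]) + O(\delta^{-3/2}).
\]

Noting that $\langle \mathbb{E}[U_\infty], \pi\rangle_\pi = 0$ because $\langle U_t, \pi\rangle_\pi = 0$ for all $t \ge 0$, we have from (39) that

\[
\begin{aligned}
\kappa\,\|\mathbb{E}[U_\infty]\|_\pi^2 &\le -\langle \mathbb{E}[U_\infty], Q^T \mathbb{E}[U_\infty]\rangle_\pi\\
&= \delta^{-1}\langle \mathbb{E}[U_\infty], b_2(\mathbb{E}[U_\infty])\rangle_\pi + O(\delta^{-3/2})\\
&\le C'''\delta^{-1}\|\mathbb{E}[U_\infty]\|_\pi^2 + O(\delta^{-3/2})
\end{aligned}
\]

for a suitable constant $C'''$, and hence,

\[
\mathbb{E}[U_\infty^i] = O(\delta^{-3/4}), \quad 1 \le i \le n. \tag{47}
\]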



From (43), (45) and (46),

(48) 



E 



-E 



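Recall that $\chi(\delta)$ is the Lyapunov exponent, and that

\[
\begin{aligned}
\chi(\delta) &= \mu^T \mathbb{E}[Y_\infty] - \frac{1}{2}\mathbb{E}\bigl[Y_\infty^T \Sigma Y_\infty\bigr]\\
&= \mu^T \mathbb{E}\bigl[\delta^{-1/2}U_\infty + \pi + \delta^{-1}\nu\bigr]
 - \frac{1}{2}\delta^{-1}\mathbb{E}\bigl[U_\infty^T \Sigma U_\infty\bigr]
 - \delta^{-1/2}\,\mathbb{E}\bigl[U_\infty^T\bigr]\Sigma(\pi + \delta^{-1}\nu)\\
&\quad - \frac{1}{2}(\pi + \delta^{-1}\nu)^T \Sigma (\pi + \delta^{-1}\nu).
\end{aligned}
\]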



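Substituting in (47) and (48), and noting from (44) that the random vector $V_\infty$ is Gaussian with mean vector $0$ and covariance matrix

\[
\int_0^\infty \exp(Q^T s)\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\Sigma\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\exp(Qs)\,ds,
\]

we conclude that

\[
\begin{aligned}
\chi(\delta) &= \mu^T\pi - \frac{\pi^T\Sigma\pi}{2} + \delta^{-1}\Bigl((\mu - \Sigma\pi)^T\nu - \frac{1}{2}\operatorname{Tr}\bigl(\mathbb{E}[V_\infty V_\infty^T]\Sigma\bigr)\Bigr) + O(\delta^{-5/4})\\
&= \mu^T\pi - \frac{\pi^T\Sigma\pi}{2} + \delta^{-1}\Bigl((\mu - \Sigma\pi)^T\nu\\
&\qquad - \frac{1}{2}\int_0^\infty \operatorname{Tr}\bigl(\exp(Q^T s)\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\Sigma\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\exp(Qs)\,\Sigma\bigr)\,ds\Bigr) + O(\delta^{-5/4})
\end{aligned}
\]

as $\delta \to \infty$.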



□ 



Appendix E. Proof of Corollary 5.3



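We now assume that the matrices $Q$ and $\Sigma$ are both real symmetric ($\Sigma$ is, of course, always symmetric) and that they commute. Hence, as noted in the statement of the corollary, if $\lambda_1 \le \dots \le \lambda_{n-1} < \lambda_n = 0$ are the eigenvalues of $Q$ with corresponding orthonormal eigenvectors $\xi_1, \dots, \xi_n$, where $\xi_n = \frac{1}{\sqrt{n}}\mathbf{1}$, then

\[
Q = \sum_{k=1}^{n} \lambda_k \xi_k \xi_k^T
\]

and it is possible to write the eigenvalues $\theta_1, \dots, \theta_n$ of $\Sigma$ in some order so that

\[
\Sigma = \sum_{k=1}^{n} \theta_k \xi_k \xi_k^T.
\]

By the assumption that $Q$ is symmetric, $\pi = \frac{1}{n}\mathbf{1} = \frac{1}{\sqrt{n}}\xi_n$. Therefore,

\[
\mu^T\pi - \frac{\pi^T\Sigma\pi}{2} = \bar\mu - \frac{\theta_n}{2n},
\]

where $\bar\mu = \frac{1}{n}\sum_{i=1}^{n}\mu_i$.

To find the unique vector $\nu$ that solves

\[
\mathbf{1}^T\nu = 0 \quad \text{and} \quad Q^T\nu = -(\operatorname{diag}(\pi) - \pi\pi^T)(\mu - \Sigma\pi),
\]

write $\nu = \sum_{k=1}^{n} a_k \xi_k$. The condition $\mathbf{1}^T\nu = 0$ dictates that $a_n = 0$. The second condition becomes

\[
\sum_{k=1}^{n-1} a_k \lambda_k \xi_k = -\frac{1}{n}\bigl(I - \xi_n\xi_n^T\bigr)\Bigl(\mu - \frac{\theta_n}{\sqrt{n}}\xi_n\Bigr) = -\frac{1}{n}\sum_{k=1}^{n-1} (\xi_k^T\mu)\,\xi_k,
\]

so that $a_k = -(\xi_k^T\mu)/(n\lambda_k)$ for $1 \le k \le n-1$. It follows that

\[
(\mu - \Sigma\pi)^T\nu = \Bigl(\mu - \frac{\theta_n}{\sqrt{n}}\xi_n\Bigr)^T \Bigl(-\sum_{k=1}^{n-1} \frac{\xi_k^T\mu}{n\lambda_k}\,\xi_k\Bigr) = -\sum_{k=1}^{n-1} \frac{(\xi_k^T\mu)^2}{n\lambda_k}.
\]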

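Lastly, the matrices inside the trace in the integral

\[
\int_0^\infty \operatorname{Tr}\bigl(\exp(Q^T s)\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\Sigma\,(\operatorname{diag}(\pi) - \pi\pi^T)\,\exp(Qs)\,\Sigma\bigr)\,ds
\]

commute and so the integral is

\[
\begin{aligned}
\int_0^\infty \operatorname{Tr}\bigl((\operatorname{diag}(\pi) - \pi\pi^T)^2\,\Sigma^2 \exp(2Qs)\bigr)\,ds
&= \frac{1}{n^2}\int_0^\infty \operatorname{Tr}\Bigl((I - \xi_n\xi_n^T)\Bigl(\sum_{k=1}^{n}\theta_k^2\,\xi_k\xi_k^T\Bigr)\Bigl(\sum_{k=1}^{n}\exp(2s\lambda_k)\,\xi_k\xi_k^T\Bigr)\Bigr)\,ds\\
&= \frac{1}{n^2}\int_0^\infty \sum_{k=1}^{n-1}\theta_k^2 \exp(2s\lambda_k)\,ds\\
&= -\frac{1}{2n^2}\sum_{k=1}^{n-1}\frac{\theta_k^2}{\lambda_k}.
\end{aligned}
\]

Therefore, our asymptotic approximation of $\chi(\delta)$ is

\[
\bar\mu - \frac{\theta_n}{2n} - \frac{1}{\delta n}\sum_{k=1}^{n-1}\frac{1}{\lambda_k}\Bigl[(\xi_k^T\mu)^2 - \frac{\theta_k^2}{4n}\Bigr] + O(\delta^{-5/4})
\]

as $\delta \to \infty$.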



□ 



Appendix F. Proof of Theorem 5.4



To show that Theorem 5.4 follows from Corollary 5.3, we show that the matrix entries of each irreducible representation belong to a common eigenspace of $Q$ and $\Sigma$. Suppose that $c$ is a class function and the matrix $C$ is given by $C_{g,h} = c(gh^{-1})$. Recall from (27) that

\[
c(g) = \frac{1}{\#G}\sum_{\kappa \in \hat G} \hat c(\kappa)\,\kappa(g).
\]

Therefore,

\[
C_{g,h} = \frac{1}{\#G}\sum_{\kappa \in \hat G} \hat c(\kappa)\,\kappa(gh^{-1}).
\]



KEG 

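If $\kappa$ is associated with the irreducible representation $\rho \in \hat G$, then

\[
\kappa(gh^{-1}) = \operatorname{Tr}\bigl(\rho(gh^{-1})\bigr) = \operatorname{Tr}\bigl(\rho(g)\rho(h)^\dagger\bigr) = \sum_{i,j=1}^{d_\rho} \rho_{ij}(g)\,\rho_{ij}(h)^*,
\]

where $\dagger$ denotes the Hermitian conjugate of a matrix. Set $\Pi_\kappa := (d_\kappa/\#G)\,[\kappa(gh^{-1})]_{g,h \in G}$. The $\#G \times \#G$ matrix $\Pi_\kappa$ is Hermitian, and it follows from (24) that $\Pi_\kappa^2 = \Pi_\kappa$, so that $\Pi_\kappa$ is the projection onto a $d_\kappa^2$-dimensional subspace. Again by (24), the matrices $\Pi_{\kappa'}$ and $\Pi_{\kappa''}$ are orthogonal for distinct $\kappa', \kappa''$. Thus,

\[
C = \sum_{\kappa \in \hat G} \frac{\hat c(\kappa)}{d_\kappa}\,\Pi_\kappa.
\]

This expression is nothing other than the spectral decomposition of the matrix $C$. It shows that $\hat c(\kappa)/d_\kappa$ is an eigenvalue of $C$ with multiplicity $d_\kappa^2$. In summary, for each $\kappa \in \hat G$ there are eigenvalues $\hat q(\kappa)/d_\kappa$ of $Q$ and $\hat s(\kappa)/d_\kappa$ of $\Sigma$, each with multiplicity $d_\kappa^2$.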
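For an Abelian group every $d_\kappa$ equals 1, and the statement reduces to the familiar fact that characters diagonalize convolution matrices. The following is a minimal Python sketch for the cyclic group $\mathbb{Z}_n$, chosen here only for illustration. The convention $\hat c(\kappa) = \sum_g c(g)\kappa(g)^*$ is an assumption about the normalization in (27), and all function names are hypothetical.

```python
import cmath


def character(n, index):
    """Character of Z_n: g -> exp(2*pi*i*index*g/n)."""
    return [cmath.exp(2j * cmath.pi * index * g / n) for g in range(n)]


def convolution_matrix(c):
    """Matrix with entries C[g][h] = c(g h^{-1}) = c((g - h) mod n) on Z_n."""
    n = len(c)
    return [[c[(g - h) % n] for h in range(n)] for g in range(n)]


def fourier_coefficient(c, kappa):
    """c_hat(kappa) = sum_g c(g) * conj(kappa(g)) (assumed convention)."""
    return sum(cg * kg.conjugate() for cg, kg in zip(c, kappa))


def max_eigen_residual(c):
    """Largest entry of |C kappa - c_hat(kappa) kappa| over all characters."""
    n = len(c)
    C = convolution_matrix(c)
    worst = 0.0
    for index in range(n):
        kappa = character(n, index)
        c_hat = fourier_coefficient(c, kappa)
        for g in range(n):
            Cv = sum(C[g][h] * kappa[h] for h in range(n))
            worst = max(worst, abs(Cv - c_hat * kappa[g]))
    return worst


# Hypothetical example: nearest-neighbour dispersal on a ring of 4 patches.
ring = [-2.0, 1.0, 0.0, 1.0]
print(max_eigen_residual(ring))
```

Each character is an eigenvector of $C$ with eigenvalue $\hat c(\kappa)$, so the residual is zero up to rounding.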

Therefore, in the notation of Corollary 5.3,



Similarly, we can split the sum



k=i 

up into contributions from each non-trivial character $\kappa$ that are of the form

q{K) ^ 

where the sum is over the indices that correspond to eigenvectors in the range of the projection $\Pi_\kappa$. By pairwise orthogonality of the matrices $\Pi_\kappa$ and the fact that $\mu$ is real, this last quantity is equal to



g.heG 



q{K) 

1023 by definition of D 



Appendix G. Proof of Theorem 5.5

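We first recall some notation. For $0 \le r, \ell \le k+1$,

\[
Z_r := G_1 \otimes \cdots \otimes G_{r-1} \otimes \{\mathrm{id}_r\} \otimes \cdots \otimes \{\mathrm{id}_k\},
\]

\[
Z^\ell := \{\mathrm{id}_1\} \otimes \cdots \otimes \{\mathrm{id}_\ell\} \otimes G_{\ell+1} \otimes \cdots \otimes G_k
\]

and

\[
\ell(g) := \min\{j : g_j \ne \mathrm{id}_j\}.
\]

The displacement associated with $g \in G$ moves between two patches that are in the same metapatch at scale $\ell(g)$ but different metapatches at scales $\ell(g)+1, \ell(g)+2, \ldots$ Recall also that $\#G_r = n_r$, $N_r = \#Z_r = \prod_{j=1}^{r-1} n_j$ and $N^\ell = \#Z^\ell = \prod_{j=\ell+1}^{k} n_j$.

Writing $1_j$ for the trivial character on $G_j$, put

\[
\hat Z_r := \hat G_1 \otimes \cdots \otimes \hat G_{r-1} \otimes \{1_r\} \otimes \cdots \otimes \{1_k\}
 = \{\kappa \in \hat G : \kappa(g) = 1 \ \forall g \in Z^{r-1}\}
\]

and

\[
r(\kappa) := \max\{j : \kappa \notin \hat Z_j\}.
\]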

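The following orthogonality property of characters:

\[
\sum_{g \in G} \kappa'(g)\,\kappa''(g)^* =
\begin{cases}
\#G, & \text{if } \kappa' = \kappa'',\\
0, & \text{otherwise,}
\end{cases}
\]

leads to the relation

\[
\sum_{g \in Z^r} \kappa(g) =
\begin{cases}
N^r, & \text{if } \kappa \in \hat Z_{r+1},\\
0, & \text{otherwise.}
\end{cases}
\]

We denote this quantity, as a function of $\kappa$, by $N^r \delta_{\hat Z_{r+1}}$.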



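Define the function $f_\ell : G \to \mathbb{C}$ by setting $f_\ell(g) = 1$ if $\ell(g) = \ell$ and $f_\ell(g) = 0$ otherwise. Then,

\[
\hat f_\ell(\kappa) = \sum_{g : \ell(g) = \ell} \kappa(g) = N^{\ell-1}\delta_{\hat Z_\ell}(\kappa) - N^{\ell}\delta_{\hat Z_{\ell+1}}(\kappa).
\]

Our assumption that $s(g) = s_{\ell(g)}$ implies that $s(g) = \sum_{\ell=1}^{k+1} s_\ell f_\ell(g)$. Since $\kappa \in \hat Z_\ell$ if and only if $r(\kappa) + 1 \le \ell$, it follows by linearity that

\[
\begin{aligned}
\hat s(\kappa) &= \sum_{\ell=1}^{k+1} s_\ell \hat f_\ell(\kappa)\\
&= \sum_{\ell=r(\kappa)+1}^{k+1} s_\ell N^{\ell-1} - \sum_{\ell=r(\kappa)}^{k+1} s_\ell N^{\ell}\\
&= \sum_{\ell=r(\kappa)}^{k} s_{\ell+1} N^{\ell} - \sum_{\ell=r(\kappa)}^{k} s_\ell N^{\ell}\\
&= \sum_{\ell=r(\kappa)}^{k} (s_{\ell+1} - s_\ell)\,N^{\ell},
\end{aligned}
\]

where we used the convention $N^{k+1} = 0$.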
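The closed form can be checked by brute force on a small example. The sketch below assumes that each $G_j$ is cyclic of order $n_j$ (an illustrative special case), takes $\ell(g)$ to be the first coordinate of $g$ that differs from the identity, with $\ell(\mathrm{id}) = k+1$, and compares $\sum_g s_{\ell(g)}\kappa(g)$ with $\sum_{\ell=r(\kappa)}^{k}(s_{\ell+1}-s_\ell)N^\ell$ for every non-trivial character. All function and variable names are hypothetical.

```python
import cmath
from itertools import product


def max_discrepancy(ns, s):
    """Largest gap between brute-force s_hat(kappa) and the closed form.

    ns = (n_1, ..., n_k): orders of the cyclic factors (assumption: G_j = Z_{n_j}).
    s  = [s_1, ..., s_{k+1}]: value of s(g) on the set {g : l(g) = l}.
    """
    k = len(ns)
    # N_up[l] = N^l = prod_{j=l+1}^{k} n_j for 0 <= l <= k; N^{k+1} = 0 by convention.
    N_up = [0] * (k + 2)
    N_up[k] = 1
    for l in range(k - 1, -1, -1):
        N_up[l] = N_up[l + 1] * ns[l]

    def level(g):
        # l(g) = min{j : g_j != id_j}, with l(id) = k + 1.
        for j, gj in enumerate(g, start=1):
            if gj != 0:
                return j
        return k + 1

    group = list(product(*(range(n) for n in ns)))
    worst = 0.0
    for a in group:  # a indexes the character kappa_a
        if not any(a):
            continue  # skip the trivial character
        # kappa_a lies outside Z_hat_j exactly when some a_i with i >= j is non-zero.
        r = max(j for j, aj in enumerate(a, start=1) if aj != 0)
        brute = sum(
            s[level(g) - 1]
            * cmath.exp(2j * cmath.pi * sum(ai * gi / n for ai, gi, n in zip(a, g, ns)))
            for g in group
        )
        closed = sum((s[l] - s[l - 1]) * N_up[l] for l in range(r, k + 1))
        worst = max(worst, abs(brute - closed))
    return worst


print(max_discrepancy((2, 3), [0.3, 1.0, 2.0]))
```

For $G = \mathbb{Z}_2 \otimes \mathbb{Z}_3$ a character with $r(\kappa) = 1$ gives $s_3 + 2s_2 - 3s_1$ on both sides, and one with $r(\kappa) = 2$ gives $s_3 - s_2$.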

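Turning to $q$, we have $q(g) = q_{\ell(g)}$ for $g \ne \mathrm{id}_G$ and $q(\mathrm{id}_G) = q_{k+1} = -\sum_{\ell=1}^{k} q_\ell\,(N^{\ell-1} - N^{\ell})$. By the same argument as above,

\[
\begin{aligned}
\hat q(\kappa) &= \sum_{\ell=r(\kappa)+1}^{k+1} q_\ell N^{\ell-1} - \sum_{\ell=r(\kappa)}^{k+1} q_\ell N^{\ell}\\
&= \sum_{\ell=r(\kappa)+1}^{k+1} q_\ell\,(N^{\ell-1} - N^{\ell}) - q_{r(\kappa)} N^{r(\kappa)}\\
&= \sum_{\ell=r(\kappa)+1}^{k} q_\ell\,(N^{\ell-1} - N^{\ell}) - \sum_{\ell=1}^{k} q_\ell\,(N^{\ell-1} - N^{\ell}) - q_{r(\kappa)} N^{r(\kappa)}\\
&= -\sum_{\ell=1}^{r(\kappa)} q_\ell\,(N^{\ell-1} - N^{\ell}) - q_{r(\kappa)} N^{r(\kappa)}\\
&= -\sum_{\ell=1}^{r(\kappa)-1} q_\ell\,(N^{\ell-1} - N^{\ell}) - q_{r(\kappa)} N^{r(\kappa)-1}.
\end{aligned}
\]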



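Lastly, for an arbitrary function $\mu$ we need to evaluate

\[
\sum_{\kappa : r(\kappa) = r} |\hat\mu(\kappa)|^2.
\]

We do that by using the following lemma that follows immediately from orthogonality of characters.

Lemma G.1. Let $H$ and $K$ be two finite Abelian groups. For $f : H \otimes K \to \mathbb{C}$,

\[
\sum_{\kappa \in \hat H} \Bigl|\sum_{(h,k) \in H \otimes K} f(h,k)\,\kappa(h)\Bigr|^2 = \#H \sum_{h \in H} \Bigl|\sum_{k \in K} f(h,k)\Bigr|^2.
\]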
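Lemma G.1 amounts to Parseval's identity on $H$ applied to the function $h \mapsto \sum_{k} f(h,k)$. A small Python check for cyclic groups $H = \mathbb{Z}_{n_H}$ and $K = \mathbb{Z}_{n_K}$ (an illustrative choice; the function name and the sample values are hypothetical):

```python
import cmath


def lemma_g1_sides(f, n_h, n_k):
    """Return (lhs, rhs) of the identity for f[h][k] on Z_{n_h} x Z_{n_k}."""
    lhs = 0.0
    for index in range(n_h):  # one character of H per index
        total = sum(
            f[h][k] * cmath.exp(2j * cmath.pi * index * h / n_h)
            for h in range(n_h)
            for k in range(n_k)
        )
        lhs += abs(total) ** 2
    # Right-hand side: #H times the squared moduli of the sums over K.
    rhs = n_h * sum(abs(sum(f[h][k] for k in range(n_k))) ** 2 for h in range(n_h))
    return lhs, rhs


# Hypothetical sample function on Z_3 x Z_2.
sample = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.25]]
lhs, rhs = lemma_g1_sides(sample, 3, 2)
print(lhs, rhs)
```

Here the inner sums over $K$ are $3$, $-0.5$ and $3.25$, so both sides equal $3 \times 19.8125 = 59.4375$ up to rounding.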
Using Lemma G.1 applied to the decomposition of $G$ as $Z_r \otimes Z^{r-1}$, we get

1046 Further decomposing as Zr 'Si Gr and Zr (g) Gr, and using Nr+i — rirNr gives 



E 

K:r{K,)—r 



Ml 



E iH«-EiiHi^ 



K.eZr+1 

Nr+1 

#G 



KeZr 



Nr 



E IE MM I -|^E| E M..^) 



#G 



/ 

E 



gez,. 



1^ (emc^m) 
^ E E M^;/^-)) 1 

heGr zez,. J I 



2 



1047 To turn the remaining sums into averages, we need to pull out a factor of NrN^, leaving us with UrNr+iNrN^ 



1048 rifci — ifG^- Therefore, recalling that 



= ]^ E f E E /^(ff/^-)) - E E 



1049 we have 



5] Ml^#Gxv^ir). 

K:r{K)—r 

The theorem follows once we note that

\[
\#\{\kappa : r(\kappa) = r\} = \#(\hat Z_{r+1} \setminus \hat Z_r) = N_{r+1} - N_r.
\]